On Mon, Aug 23, 2021 at 12:53 PM Anas Jamshed <anasjamshed1994 at gmail.com> wrote:
I have the file GSE162562_RAW.tar. First I untar it with
untar("GSE162562_RAW.tar")
and then I decompress the extracted files with
system("gunzip ~/Desktop/GSE162562_RAW/*.gz")
This runs fine on Linux but not on Windows. What changes should I
make so that this command also runs on Windows?
Need help to unzip files in Windows
9 messages · Anas Jamshed, Andrew Simmons, Abby Spurdle +1 more
Andrew Simmons <akwsimmo at gmail.com> wrote:
Hello,
I don't think you need to use a system command directly; I think
'utils::untar' is all you need. I tried the same thing myself, something
like:
URL <- "https://exiftool.org/Image-ExifTool-12.30.tar.gz"
FILE <- file.path(tempdir(), basename(URL))
utils::download.file(URL, FILE)
utils::untar(FILE, exdir = dirname(FILE))
and it makes a folder "Image-ExifTool-12.30". It seems to work perfectly
fine in Windows 10 x64 build 19042. Can you send the specific file (or
provide a URL to the specific file) that isn't working for you?
______________________________________________ R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Anas Jamshed <anasjamshed1994 at gmail.com> wrote:
I am trying this URL:
"https://ftp.ncbi.nlm.nih.gov/geo/series/GSE162nnn/GSE162562/suppl/GSE162562_RAW.tar"
but it is not giving me any file.
Andrew Simmons <akwsimmo at gmail.com> wrote:
Hello,
I tried downloading that file using 'utils::download.file', which worked,
but 'utils::untar' then kept complaining about a "damaged archive".
However, it seemed to work when I downloaded the archive manually. The
solution I found is that you have to specify the mode in which the file
is downloaded. Something like:
URL <- "https://ftp.ncbi.nlm.nih.gov/geo/series/GSE162nnn/GSE162562/suppl/GSE162562_RAW.tar"
FILE <- file.path(tempdir(), basename(URL))
utils::download.file(URL, FILE, mode = "wb")
utils::untar(FILE, exdir = dirname(FILE))
worked perfectly for me. It also seems to still work on Ubuntu, but let
us know if you find it doesn't. I hope this helps!
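A minimal sketch of the fix described above, wrapped in a helper so that the download is always done in binary mode and the archive is checked before it is extracted. The name 'fetch_archive' and the check via 'untar(list = TRUE)' are illustrative assumptions, not something from the thread.

```r
# Hypothetical helper (not from the thread): download an archive in binary
# mode, then list its members as a cheap check that it is readable.
fetch_archive <- function(url, destfile = file.path(tempdir(), basename(url))) {
  # mode = "wb" writes the bytes unchanged; text mode on Windows can alter
  # them, which is one way to end up with a "damaged archive"
  status <- utils::download.file(url, destfile, mode = "wb", quiet = TRUE)
  if (status != 0L) stop("download failed with status ", status)
  # listing the members fails early if the archive cannot be read
  members <- utils::untar(destfile, list = TRUE)
  list(file = destfile, members = members)
}

# Usage with the URL from this thread (needs network access):
# res <- fetch_archive("https://ftp.ncbi.nlm.nih.gov/geo/series/GSE162nnn/GSE162562/suppl/GSE162562_RAW.tar")
# utils::untar(res$file, exdir = path.expand("~/GSE162562_RAW"))
```

The usage lines are left commented out because they depend on the remote server being reachable.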
Abby Spurdle wrote:
There are some differences in R between Windows and Linux.
You could try the 'shell' command instead.
#On Windows
?shell
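If the goal is only to decompress the extracted .gz files, one option that avoids both 'system' and 'shell' is to do it with base R connections, which behave the same on Windows and Linux. This is a sketch under assumptions: the helper name 'gunzip_file' is hypothetical, and unlike the command-line gunzip it keeps the original .gz file.

```r
# Hypothetical helper (not from the thread): decompress one .gz file using
# only base R connections, so no external gunzip program is required.
gunzip_file <- function(gz_path) {
  if (!grepl("\\.gz$", gz_path)) stop("expected a path ending in .gz: ", gz_path)
  out_path <- sub("\\.gz$", "", gz_path)
  con_in <- gzfile(gz_path, "rb")
  on.exit(close(con_in), add = TRUE)
  con_out <- file(out_path, "wb")
  on.exit(close(con_out), add = TRUE)
  # copy the decompressed bytes in chunks to keep memory use small
  repeat {
    chunk <- readBin(con_in, what = "raw", n = 1e6)
    if (length(chunk) == 0L) break
    writeBin(chunk, con_out)
  }
  invisible(out_path)
}
```

To apply it to a whole folder, something like lapply(list.files(path.expand("~/Desktop/GSE162562_RAW"), pattern = "\\.gz$", full.names = TRUE), gunzip_file) should work, assuming that folder holds the extracted files.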
Anas Jamshed <anasjamshed1994 at gmail.com> wrote:
Sir, after that I want to run:
#get the list of sample names
GSMnames <- t(list.files("~/Desktop/GSE162562_RAW", full.names = FALSE))
#remove .txt from file/sample names
GSMnames <- gsub(pattern = ".txt", replacement = "", GSMnames)
#make a vector of the list of files to aggregate
files <- list.files("~/Desktop/GSE162562_RAW", full.names = TRUE)
but it is not running, because utils::untar(FILE, exdir = dirname(FILE))
creates another 108 archives.
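A side note on the sample-name step: as a regular expression, the pattern ".txt" is unanchored and its dot matches any character, so it can remove more than intended while leaving ".gz" in place. The sketch below compares it with an anchored pattern and with 'tools::file_path_sans_ext'. The second file name is made up purely to show the pitfall.

```r
# Toy file names: the first follows the naming seen in this thread,
# the second is invented to trigger the pitfall
fls <- c("GSM4954457_A_1_Asymptom.txt.gz", "GSM0000000_Xtxt_demo.txt.gz")

# Unanchored regex: "." matches any character, so "Xtxt" is removed as well,
# and the ".gz" suffix is left behind
loose <- gsub(pattern = ".txt", replacement = "", fls)

# Escaped and anchored pattern: only a trailing ".txt.gz" is removed
strict <- sub("\\.txt\\.gz$", "", fls)

# Base R alternative: strip the compression suffix, then the extension
via_tools <- tools::file_path_sans_ext(fls, compression = TRUE)

print(data.frame(file = fls, loose = loose, strict = strict))
```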
Andrew Simmons <akwsimmo at gmail.com> wrote:
Hello,
I see what you're saying: the .tar archive contains many more
compressed files, but that's not necessarily a problem. R can read directly
from a compressed file without having to decompress it beforehand. I
modified my code to look a little more like yours:
# need to do 'path.expand' or 'untar' will fail
# this is where we put the downloaded files
exdir <- path.expand("~/GSE162562_RAW")
dir.create(exdir, showWarnings = FALSE)
URL <- "https://ftp.ncbi.nlm.nih.gov/geo/series/GSE162nnn/GSE162562/suppl/GSE162562_RAW.tar"
FILE <- file.path(tempdir(), basename(URL))
utils::download.file(URL, FILE, mode = "wb")
utils::untar(FILE, exdir = exdir)
unlink(FILE, recursive = TRUE, force = TRUE)
# 'files' is the full path to the downloaded files
# attribute 'names' is the basename with '.txt.gz' removed from the end
files <- list.files(exdir, full.names = TRUE)
names(files) <- sub("\\.txt\\.gz$", "", basename(files))
# R can open compressed files without decompressing beforehand
print(utils::read.table(files[[1]], sep = "\t"))
print(utils::read.delim(files[[2]], header = FALSE))
Does this work better than before for you?
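To illustrate the point that R reads gzip-compressed text directly, here is a small self-contained sketch that does not need the download. The file contents are toy rows, and the column names 'gene' and 'count' are made up for the example.

```r
# Write a tiny tab-separated file through a gzip connection
tmp <- tempfile(fileext = ".txt.gz")
con <- gzfile(tmp, "w")
writeLines(c("A1BG\t4", "A1BG-AS1\t52", "A1CF\t12"), con)
close(con)

# read.table is given the compressed path as-is; no gunzip step is needed
counts <- utils::read.table(tmp, sep = "\t", col.names = c("gene", "count"))
str(counts)
```

The same call pattern should apply to each of the extracted '.txt.gz' files, assuming they are plain tab-separated text like the ones shown in this thread.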
Anas Jamshed <anasjamshed1994 at gmail.com> wrote:
But the point is: where should I start from now?
1 day later
Rui Barradas wrote:
Hello,
Are you looking for the step that follows Andrew's code (repeated below)
to download and untar the files?
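# read one gzip-compressed file; on a warning or an error, return the
# condition object instead of stopping, so a loop over files can continue
read_one_gz_file <- function(x, path){
  fl <- file.path(path, x)
  tryCatch({
    read.table(gzfile(fl))
  },
  warning = function(w) w,
  error = function(e) e
  )
}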
URL <-
"https://ftp.ncbi.nlm.nih.gov/geo/series/GSE162nnn/GSE162562/suppl/GSE162562_RAW.tar"
FILE <- file.path(tempdir(), basename(URL))
utils::download.file(URL, FILE, mode = "wb")
utils::untar(FILE, exdir = dirname(FILE))
fls <- list.files(path = dirname(FILE), pattern = "\\.gz$")
length(fls)
#[1] 108
data_list <- lapply(fls, read_one_gz_file, path = dirname(FILE))
length(data_list)
#[1] 108
head(data_list[[1]])
#        V1  V2
#1     A1BG   4
#2 A1BG-AS1  52
#3     A1CF  12
#4      A2M 645
#5  A2M-AS1 113
#6    A2ML1  21
I don't understand what you mean by aggregating the files, but if you
want them all in one data.frame, maybe this will do it.
sapply(data_list, ncol) # All files have 2 columns
# create a column with the original dataset name
data_list <- lapply(seq_along(data_list), function(i){
  dftmp <- data_list[[i]]
  dftmp$dataset <- sub("\\.txt\\.gz$", "", fls[i])
  dftmp
})
# put all data sets in one data.frame
df1 <- do.call(rbind, data_list)
dim(df1) # Over 2.8 million rows, 3 columns
head(df1) # see the first 6 rows
#        V1  V2                 dataset
#1     A1BG   4 GSM4954457_A_1_Asymptom
#2 A1BG-AS1  52 GSM4954457_A_1_Asymptom
#3     A1CF  12 GSM4954457_A_1_Asymptom
#4      A2M 645 GSM4954457_A_1_Asymptom
#5  A2M-AS1 113 GSM4954457_A_1_Asymptom
#6    A2ML1  21 GSM4954457_A_1_Asymptom
Hope this helps,
Rui Barradas
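If "aggregate" instead means one table with a row per gene and a column per sample, which is an assumption about the intent and not something stated in the thread, a merge-based sketch could look like the following. The two tiny data frames stand in for elements of 'data_list'; the sample names and the second sample's counts are made up.

```r
# Toy stand-ins for two of the per-sample data frames (columns V1, V2)
data_list <- list(
  sampleA = data.frame(V1 = c("A1BG", "A2M"), V2 = c(4, 645)),
  sampleB = data.frame(V1 = c("A1BG", "A2M"), V2 = c(7, 600))
)

# Rename the columns so each count column carries its sample name
renamed <- Map(function(df, nm) setNames(df, c("gene", nm)),
               data_list, names(data_list))

# Merge the samples pairwise on the gene column; all = TRUE keeps genes
# that are missing from some samples (filled with NA)
count_table <- Reduce(function(x, y) merge(x, y, by = "gene", all = TRUE),
                      renamed)
print(count_table)
```

With the real files, the list names could come from the file names with the '.txt.gz' suffix removed. Repeated merges over 108 samples may be slow, so this is meant as an illustration of the shape of the result rather than a tuned solution.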