Issue
I need to download .txt files which are generated from links like this one: https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0 but I need to download them in the bash shell. It works perfectly fine in Firefox, but in the shell I tried wget and curl to no avail. I read lots of similar questions on Stack Overflow and other pages and tried everything I could find, but couldn't find a solution. For example:
curl https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0
This is the output, and no file is downloaded:
[1] 1094
[2] 1095
[3] 1096
[4] 1097
[5] 1098
[2] Done result=read_run
[3] Done fields=fastq_ftp
[4]- Done format=tsv
(base) user@DESKTOP-LV4SKHQ:/mnt/c/Users/conog/Desktop/prova$ curl: (6) Could not resolve host: www.ebi.ac.uk
[1]- Exit 6 curl https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480
[5]+ Done download=true
Another example, after I read a couple of posts here:
curl -O -L https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0
[1] 1056
[2] 1057
[3] 1058
[4] 1059
[5] 1060
[2] Done result=read_run
[3] Done fields=fastq_ftp
[4] Done format=tsv
[5]+ Done download=true
(base) gsoletta@DESKTOP-LV4SKHQ:/mnt/c/Users/conog/Desktop/prova$ % Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 49 100 49 0 0 68 0 --:--:-- --:--:-- --:--:-- 67
[1]+ Done
This last one downloads a 49-byte file with no extension, called filereportaccession=SRP002480, with the content: "Required String parameter 'result' is not present".
I'll also add that I'm a novice at bash. What could I do?
Thank you!
Solution
It works for me:
$ curl -s 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0'
run_accession fastq_ftp
SRR1620013 ftp.sra.ebi.ac.uk/vol1/fastq/SRR162/003/SRR1620013/SRR1620013_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR162/003/SRR1620013/SRR1620013_2.fastq.gz
SRR1620014 ftp.sra.ebi.ac.uk/vol1/fastq/SRR162/004/SRR1620014/SRR1620014_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR162/004/SRR1620014/SRR1620014_2.fastq.gz
...
$ wget -O filereport.tsv 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0'
--2021-11-15 17:51:48-- https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0
Resolving www.ebi.ac.uk (www.ebi.ac.uk)... 193.62.193.80
Connecting to www.ebi.ac.uk (www.ebi.ac.uk)|193.62.193.80|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/plain]
Saving to: ‘filereport.tsv’
...
2021-11-15 17:51:51 (831 KB/s) - ‘filereport.tsv’ saved [675136]
Your problem is that you didn't put quotes around the URL. When the URL is unquoted, bash treats each & in it as the background operator, so the URL is split at every & and each parameter=value piece is run as a separate command.
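As a minimal sketch (the output filename filereport.tsv is just an illustrative choice), you can either quote the whole URL or let curl build the query string itself with -G and --data-urlencode, so the & characters never reach the shell at all:
$ curl -sS -o filereport.tsv 'https://www.ebi.ac.uk/ena/portal/api/filereport?accession=SRP002480&result=read_run&fields=fastq_ftp&format=tsv&download=true&limit=0'
$ curl -sS -o filereport.tsv -G https://www.ebi.ac.uk/ena/portal/api/filereport \
    --data-urlencode accession=SRP002480 \
    --data-urlencode result=read_run \
    --data-urlencode fields=fastq_ftp \
    --data-urlencode format=tsv \
    --data-urlencode download=true \
    --data-urlencode limit=0
With -G, curl appends the --data-urlencode pairs to the URL as a GET query string; -o names the local file explicitly, unlike -O, which keeps the remote name.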
Answered By - Stephen Ostermiller