I have a FASTA file, temp_mart.txt, like this:
ENSG00000100219|ENST00000005082
MTLLTFRDVAIEFSLEEWKCLDLAQQNLYRDVMLENYRNLFSVGLTVCKPGL
And I tried to load it into a table in SQL using
load data local infile '~/Desktop/temp_mart.txt' into table mart
But instead of getting an output like this:
+-----------------+------+------+
| ENSG | ENST | 3UTR |
+-----------------+------+------+
| >ENSG0000010021 | ENST00000005082 | MTLLTFRDVAIEFSL |
I get this:
+-----------------+------+------+
| ENSG | ENST | 3UTR |
+-----------------+------+------+
| >ENSG0000010021 | NULL | NULL |
| MTLLTFRDVAIEFSL | NULL | NULL |
| EPWNVKRQEAADGHP | NULL | NULL |
| DKFTAMSSHFTQDLL | NULL | NULL |
Everything seems to go into the first column. What is the best way to load it into the table so it presents as intended? Do I need to convert it into a CSV file first?
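LOAD DATA reads one table row per input line, so a two-line FASTA record will never split across three columns on its own. A minimal sketch, assuming each record is first flattened onto a single '|'-delimited line (for example by joining every header line with the sequence line that follows it) and assuming the column names shown in the expected output:
-- assumes one record per line: ENSG...|ENST...|SEQUENCE
LOAD DATA LOCAL INFILE '~/Desktop/temp_mart.txt'
INTO TABLE mart
FIELDS TERMINATED BY '|'
LINES TERMINATED BY '\n'
(ENSG, ENST, `3UTR`);
Converting to CSV would work the same way; the important parts are that each record occupies one line and that the delimiter is declared with FIELDS TERMINATED BY.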
I have one table in PostgreSQL and cannot figure out how to build a query.
The table contains the columns nr_serii and deleting_time. I am trying to count nr_serii and subtract from that count the positions that have a deleting_time.
My query:
select nr_serii, count(nr_serii) as ilosc, count(deleting_time) as ilosc_delete
from MyTable
group by nr_serii, deleting_time
output is:
+--------------------+
| "666666";1;1 |
| "456456";1;0 |
| "333333";3;0 |
| "333333";1;1 |
| "111111";1;1 |
| "111111";3;0 |
+--------------------+
Part of the table with the raw data:
+--------------------------------+
| "666666";"2020-11-20 14:08:13" |
| "456456";"" |
| "333333";"" |
| "333333";"" |
| "333333";"" |
| "333333";"2020-11-20 14:02:23" |
| "111111";"" |
| "111111";"" |
| "111111";"2020-11-20 14:08:04" |
| "111111";"" |
+--------------------------------+
And I need to subtract column ilosc_delete from column ilosc, for example:
nr_serii:333333 ilosc:3-1=2
Expected output:
+-------------+
| "666666";-1 |
| "456456";1 |
| "333333";2 |
| "111111";2 |
| ... |
+-------------+
I think there is a very simple solution for this, but my mind is blank.
I see what you want now. You want to subtract the number of rows where deleting_time is not null from the number of rows where it is null:
select nr_serii,
count(*) filter (where deleting_time is null) - count(deleting_time) as ilosc_delete
from MyTable
group by nr_serii;
Here is a db<>fiddle.
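If the FILTER clause is not available (it needs PostgreSQL 9.4 or later), the same difference can be computed with a SUM over a CASE expression; a sketch using the same table and column names as above:
select nr_serii,
       sum(case when deleting_time is null then 1 else -1 end) as ilosc_delete
from MyTable
group by nr_serii;
Each row where deleting_time is null contributes +1 and each row with a deleting_time contributes -1, which gives the same subtraction.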
I apologise in advance because I have no idea how to structure this question.
I have the following tables:
Sessions:
+----------+---------+
| login | host |
+----------+---------+
| breilly | node001 |
+----------+---------+
| pparker | node003 |
+----------+---------+
| jjameson | node004 |
+----------+---------+
| jjameson | node012 |
+----------+---------+
Userlist:
+----------+----------------+------------------+
| login | primary_server | secondary_server |
+----------+----------------+------------------+
| breilly | node001 | node010 |
+----------+----------------+------------------+
| pparker | node002 | node003 |
+----------+----------------+------------------+
| jjameson | node003 | node004 |
+----------+----------------+------------------+
What kind of SQL query should I perform so I can get a table like this?:
+----------+---------+------------+
| login | Host | Server |
+----------+---------+------------+
| jjameson | node004 | Secondary |
+----------+---------+------------+
| jjameson | node012 | Wrong Node |
+----------+---------+------------+
| pparker | node003 | Secondary |
+----------+---------+------------+
| breilly | node001 | Primary |
+----------+---------+------------+
Currently I'm just using Go with a bunch of structs / hashmaps to generate this.
I am planning to migrate the users / sessions to an in-memory SQLite database, but I can't seem to wrap my head around a query to get this sort of table.
The Server column is based on whether the user is logged in to their primary server, secondary server, or the wrong machine.
I've put this in SQL Fiddle as well
Use CASE logic:
select s.*,
       (case when s.host = ul.primary_server then 'primary'
             when s.host = ul.secondary_server then 'secondary'
             else 'wrong node'
        end) as server
from sessions s left join
     userlist ul
     on s.login = ul.login;
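Since the plan is to move the users / sessions into an in-memory SQLite database, here is a minimal end-to-end sketch of the same CASE logic there. The table definitions are assumptions based on the sample data above, and the labels are capitalized to match the desired output:
-- schema and sample data taken from the question
CREATE TABLE sessions (login TEXT, host TEXT);
CREATE TABLE userlist (login TEXT, primary_server TEXT, secondary_server TEXT);
INSERT INTO sessions VALUES
  ('breilly', 'node001'), ('pparker', 'node003'),
  ('jjameson', 'node004'), ('jjameson', 'node012');
INSERT INTO userlist VALUES
  ('breilly', 'node001', 'node010'),
  ('pparker', 'node002', 'node003'),
  ('jjameson', 'node003', 'node004');
-- classify each session against the user's configured servers
SELECT s.login, s.host,
       CASE WHEN s.host = ul.primary_server   THEN 'Primary'
            WHEN s.host = ul.secondary_server THEN 'Secondary'
            ELSE 'Wrong Node'
       END AS server
FROM sessions s
LEFT JOIN userlist ul ON s.login = ul.login;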
Consider the following sample table ("Customer") with these records:
=========
Customer
=========
+-------------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
| customer-id | att-a    | att-b    | att-c    | att-d    | att-e    | att-f    | att-g    | att-h    | att-i    | att-j    |
+-------------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
| customer-1  | att-a-7  | att-b-3  | att-c-10 | att-d-10 | att-e-15 | att-f-11 | att-g-2  | att-h-7  | att-i-5  | att-j-14 |
| customer-2  | att-a-9  | att-b-7  | att-c-12 | att-d-4  | att-e-10 | att-f-4  | att-g-13 | att-h-4  | att-i-1  | att-j-13 |
| customer-3  | att-a-10 | att-b-6  | att-c-1  | att-d-1  | att-e-13 | att-f-12 | att-g-9  | att-h-6  | att-i-7  | att-j-4  |
| customer-19 | att-a-7  | att-b-9  | att-c-13 | att-d-5  | att-e-8  | att-f-5  | att-g-12 | att-h-14 | att-i-13 | att-j-15 |
+-------------+----------+----------+----------+----------+----------+----------+----------+----------+----------+----------+
I have these records, and many more, dumped into a SQL database and want to find the top 10 most similar customers based on the attribute values. For example, customer-1 and customer-19 have at least one matching column value, i.e. "att-a-7", so the output should give me those two customer-ids as the top similar customers: customer-1 and customer-19.
P.S. There can be one or more similar columns across rows.
I'm using a windowing technique to find the top 10 similar customers and I'm not sure if it's correct.
The following is the approach I used in my query:
row_number() over (partition by att-a, att-b,..,att-j order by customer-id) as customers
Is this correct?
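Partitioning by all ten attribute columns only numbers rows that are identical in every one of those columns, so it will not surface partial matches. A sketch of an alternative: self-join the table, score each pair of customers by how many attribute columns agree, and keep the pairs with the highest scores. The hyphenated names are double-quoted here; adjust the quoting (e.g. backticks in MySQL) and the LIMIT syntax to your database:
SELECT c1."customer-id" AS customer_a,
       c2."customer-id" AS customer_b,
       ( CASE WHEN c1."att-a" = c2."att-a" THEN 1 ELSE 0 END
       + CASE WHEN c1."att-b" = c2."att-b" THEN 1 ELSE 0 END
       + CASE WHEN c1."att-c" = c2."att-c" THEN 1 ELSE 0 END
       + CASE WHEN c1."att-d" = c2."att-d" THEN 1 ELSE 0 END
       + CASE WHEN c1."att-e" = c2."att-e" THEN 1 ELSE 0 END
       + CASE WHEN c1."att-f" = c2."att-f" THEN 1 ELSE 0 END
       + CASE WHEN c1."att-g" = c2."att-g" THEN 1 ELSE 0 END
       + CASE WHEN c1."att-h" = c2."att-h" THEN 1 ELSE 0 END
       + CASE WHEN c1."att-i" = c2."att-i" THEN 1 ELSE 0 END
       + CASE WHEN c1."att-j" = c2."att-j" THEN 1 ELSE 0 END
       ) AS matching_attributes
FROM Customer c1
JOIN Customer c2
  ON c1."customer-id" < c2."customer-id"  -- each pair once, no self-pairs
ORDER BY matching_attributes DESC
LIMIT 10;
On the four sample rows this ranks the customer-1 / customer-19 pair first, since they share "att-a-7".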
I'm considering Aerospike for one of our projects, so I created a 3-node cluster and loaded some data on it.
Sample data
ns: imei
set: imei_data
+-------------------+-----------------------+-----------------------+----------------------------+--------------+--------------+
| imsi | fcheck | lcheck | msc | fcheck_epoch | lcheck_epoch |
+-------------------+-----------------------+-----------------------+----------------------------+--------------+--------------+
| "413010324064956" | "2017-03-01 14:30:26" | "2017-03-01 14:35:30" | "13d20b080011044917004100" | 1488358826 | 1488359130 |
| "413012628090023" | "2016-09-21 10:06:49" | "2017-09-16 13:54:40" | "13dc0b080011044917006100" | 1474432609 | 1505550280 |
| "413010130130320" | "2016-12-29 22:05:07" | "2017-10-09 16:17:10" | "13d20b080011044917003100" | 1483029307 | 1507546030 |
| "413011330114274" | "2016-09-06 01:48:06" | "2017-10-09 11:53:41" | "13d20b080011044917003100" | 1473106686 | 1507530221 |
| "413012629781993" | "2017-08-16 16:03:01" | "2017-09-13 18:10:48" | "13dc0b080011044917004100" | 1502879581 | 1505306448 |
Then I created a secondary index on lcheck_epoch using AQL since I want to query based on date.
create index idx_lcheck on imei.imei_data (lcheck_epoch) NUMERIC
+--------+----------------+-----------+-------------+-------+--------------+----------------+-----------+
| ns | bin | indextype | set | state | indexname | path | type |
+--------+----------------+-----------+-------------+-------+--------------+----------------+-----------+
| "imei" | "lcheck_epoch" | "NONE" | "imei_data" | "RW" | "idx_lcheck" | "lcheck_epoch" | "NUMERIC" |
+--------+----------------+-----------+-------------+-------+--------------+----------------+-----------+
When I execute
select imsi from imei.imei_data where idx_lcheck=1476165806
I'm getting
Error: (204) AEROSPIKE_ERR_INDEX
Please explain.
You're using the index name, not the bin name, in your query. Try this:
SELECT imsi FROM imei.imei_data WHERE lcheck_epoch=1476165806
Or
SELECT imsi FROM imei.imei_data WHERE lcheck_epoch BETWEEN 1490000000 AND 1510000000
Just a note: you can do much more complex queries using predicate filtering through several of the language clients (Java, C, C#, Go). For example, the PredExp class of the Java client (see examples).
I have these two tables that I need to join:
fields_data fields
+------------+-----------+------+ +------+-------------+----------+
| relationid | fieldname | data | | name | displayname | position |
+------------+-----------+------+ +------+-------------+----------+
| 2 | ftp | test | | user | Username | top |
| 2 | other | 1234 | | pass | Password | top |
+------------+-----------+------+ | ftp | FTP | top |
| log | Log | top |
| txt | Text | mid |
+------+-------------+----------+
I want to get all the rows from the "fields" table that have the position "top", and if a row has a match on name = fieldname in fields_data, it should also show the data. This is my join:
SELECT
fd.`data`,
fd.`relationid`,
fd.`fieldname`,
f.`name`,
f.`displayname`
FROM `fields` AS f
LEFT OUTER JOIN `fields_data` AS fd
ON fd.`fieldname` = f.`name`
WHERE f.`position`='top' AND (fd.`relationid`='3' OR fd.`relationid` IS NULL)
My problem is that the above query only gives me this result:
+------+------------+-----------+------+-------------+
| data | relationid | fieldname | name | displayname |
+------+------------+-----------+------+-------------+
| NULL | NULL | NULL | user | Username |
| NULL | NULL | NULL | pass | Password |
| NULL | NULL | NULL | log | Log |
+------+------------+-----------+------+-------------+
The field called "ftp" is missing because it has a relation to "2". However, I still want to display it in the result, just like the others, with NULL in it. And if the SQL query had fd.`relationid`='2' instead of '3', it would give the same result, but with the row containing ftp in name holding data in the three fields.
I hope you get what I mean; my English is not the best. Here is the result I want:
With the above query containing fd.`relationid`='3':
+------+------------+-----------+------+-------------+
| data | relationid | fieldname | name | displayname |
+------+------------+-----------+------+-------------+
| NULL | NULL | NULL | user | Username |
| NULL | NULL | NULL | pass | Password |
| NULL | NULL | NULL | ftp | FTP |
| NULL | NULL | NULL | log | Log |
+------+------------+-----------+------+-------------+
With the above query containing fd.`relationid`='2':
+------+------------+-----------+------+-------------+
| data | relationid | fieldname | name | displayname |
+------+------------+-----------+------+-------------+
| NULL | NULL | NULL | user | Username |
| NULL | NULL | NULL | pass | Password |
| test | 2 | ftp | ftp | FTP |
| NULL | NULL | NULL | log | Log |
+------+------------+-----------+------+-------------+
You want to move the condition to the ON clause:
SELECT fd.`data`, fd.`relationid`, fd.`fieldname`, f.`name`, f.`displayname`
FROM `fields` f LEFT OUTER JOIN
     `fields_data` fd
     ON fd.`fieldname` = f.`name` AND fd.`relationid` = '3'
WHERE f.`position` = 'top';
It is interesting that the semantics of your query and this query are different, and you found exactly the situation where it matters: when a row matches on another relationid value, the WHERE clause form filters that row out. This version still keeps everything.
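To see the difference concretely, look at what the LEFT JOIN produces before any relationid condition is applied at all (same tables and sample data as above):
SELECT fd.`data`, fd.`relationid`, fd.`fieldname`, f.`name`, f.`displayname`
FROM `fields` f LEFT OUTER JOIN
     `fields_data` fd
     ON fd.`fieldname` = f.`name`
WHERE f.`position` = 'top';
+------+------------+-----------+------+-------------+
| data | relationid | fieldname | name | displayname |
+------+------------+-----------+------+-------------+
| NULL | NULL       | NULL      | user | Username    |
| NULL | NULL       | NULL      | pass | Password    |
| test | 2          | ftp       | ftp  | FTP         |
| NULL | NULL       | NULL      | log  | Log         |
+------+------------+-----------+------+-------------+
Adding fd.`relationid` = '3' in the WHERE clause throws the ftp row away, because 2 is neither '3' nor NULL. Putting the same condition in the ON clause only prevents the match, so ftp falls back to a NULL-extended row instead of vanishing.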
As a note, the following also does what you want:
SELECT fd.`data`, fd.`relationid`, fd.`fieldname`, f.`name`, f.`displayname`
FROM `fields` f LEFT OUTER JOIN
(SELECT fd.*
FROM `fields_data` fd
WHERE fd.`relationid` = '3'
) fd
ON fd.`fieldname` = f.`name`
WHERE f.`position` = 'top' ;
I wouldn't recommend writing the query this way, particularly in MySQL (because the subquery is materialized). However, understanding why your version is different from these versions (and why these are the same) is a big step forward in mastering outer joins.