How to select data in same table on different line that are empty? - sql

I have these three tables (I attach a preview). And of the end of list is example of data in table “virustotalscans.” There is column with name “virustotal.“ The each unique sample has number, for example 165, next sample has number 166 and etc.
TABLE VIRUTOTALS
CREATE TABLE virustotals (
virustotal INTEGER PRIMARY KEY,
virustotal_md5_hash TEXT NOT NULL,
virustotal_timestamp INTEGER NOT NULL,
virustotal_permalink TEXT NOT NULL
);
CREATE INDEX virustotals_md5_hash_idx
ON virustotals (virustotal_md5_hash);
TABLE VIRUSTOTALSCANS
CREATE TABLE virustotalscans (
virustotalscan INTEGER PRIMARY KEY,
virustotal INTEGER NOT NULL,
virustotalscan_scanner TEXT NOT NULL,
virustotalscan_result TEXT
);
CREATE INDEX virustotalscans_result_idx
ON virustotalscans (virustotalscan_result);
CREATE INDEX virustotalscans_scanner_idx
ON virustotalscans (virustotalscan_scanner);
CREATE INDEX virustotalscans_virustotal_idx
ON virustotalscans (virustotal);
TABLE DOWNLOADS
CREATE TABLE downloads (
download INTEGER PRIMARY KEY,
connection INTEGER,
download_url TEXT,
download_md5_hash TEXT
-- CONSTRAINT downloads_connection_fkey FOREIGN KEY (connection) REFERENCES connections (connection)
);
CREATE INDEX downloads_connection_idx ON downloads (connection);
CREATE INDEX downloads_md5_hash_idx
ON downloads (download_md5_hash);
CREATE INDEX downloads_url_idx
ON downloads (download_url);
Example of data in table “virustotalscans”: http://pastebin.com/7E7McZwT
Now, I need select all samples, which are on all lines in column “virustotalscan_result” empty. So I need select all samples, which don´t detect VirusTotal with any antivirus. I tried this select:
select distinct downloads.download_md5_hash from virustotalscans, virustotals,
downloads
where downloads.download_md5_hash = virustotals.virustotal_md5_hash and
virustotals.virustotal = virustotalscans.virustotal and
virustotalscans.virustotalscan_result IS NULL;
but I get MD5 hashes of all samples... Probably reason is that all samples contain at least one line, which is empty. It is logical because, some antivirus always doesn’t detect some sample.
The better example: http://pastebin.com/y81DPpmQ. Now I need select sample - number (column virustotal), where are all lines empty in column virustotalscan_result. It can be for example only number 2.
Can you help me please?
Thank you very much for replies.

SELECT download_md5_hash
FROM downloads
JOIN virustotals ON download_md5_hash = virustotal_md5_hash
WHERE virustotal IN (SELECT virustotal
FROM virustotalscans
GROUP BY virustotal
HAVING COUNT(virustotalscan_result) = 0)

Related

How to insert a new row into the table has unique id autoincrement?

After creating the table with a unique ID autoincrement, I realize my table lack a row. But I don't know how to do this without compromising the order of other rows in the table!
TABLE flights
id INTEGER PRIMARY KEY AUTOINCREMENT,
origin TEXT NOT NULL,
destination TEXT NOT NULL,
duration INTEGER NOT NULL
I want to insert a row: 2|Shanghai|Paris|760 into the table with id = 2.
1|New York|London|415
2|Istanbul|Tokyo|700
3|New York|Paris|435
4|Moscow|Paris|245
5|Lima|New York|455
Table I wished:
1|New York|London|415
2|Shanghai|Paris|760
3|Istanbul|Tokyo|700
4|New York|Paris|435
5|Moscow|Paris|245
6|Lima|New York|455
Thanks for any advice to me!
No way you can do this with auto-increment ID because IDS are not to order rows, but to identify the rows and assert it's the only row with that ID. If you want to, use a new specific column for this purpose, this way the IDs still the same and you can sort using anything as indexes.
CREATE TABLE flights (
id INTEGER AUTO_INCREMENT,
index INTEGER NOT NULL,
origin TEXT NOT NULL,
destination TEXT NOT NULL,
duration INTEGER NOT NULL,
PRIMARY KEY (id),
UNIQUE KEY unique_index (index)
);

How to make human readable autoincrement column in PostgreSQL?

I need to make the column for store serial number of orders in the online shop.
Currently, I have this one
CREATE TABLE public.orders
(
id SERIAL PRIMARY KEY NOT NULL,
title VARCHAR(100) NOT NULL
);
CREATE UNIQUE INDEX orders_id_uindex ON public.orders (id);
But I need to create the special alphanumeric format for storing this number
like this 5CC806CF751A2.
How can I create this format with Postgres capabilities?
You can create a view that simply converts the ID to a hex value:
create view readable_orders
as
select id,
to_hex(id) as readable_id,
title
from orders;

ORA-00904: "NO_OF_PROJ_PER_CON_PY": invalid identifier

I am trying to create a fact table which will display the number of projects per consultant per year. It has 2 dimension tables 1 for time (report_time_dim) and the other for consultants(consultant_dim) then the main fact table (fact_table).
CREATE TABLE fact_table(
fact_key INTEGER NOT NULL,
consultant_key INTEGER NOT NULL,
time_key INTEGER NOT NULL,
no_of_projects_py INTEGER,
no_of_consultants_py INTEGER,
no_of_accounts_py INTEGER,
no_of_proj_per_con_py INTEGER,
fk1_time_key INTEGER NOT NULL,
fk2_consultant_key INTEGER NOT NULL,
-- Specify the PRIMARY KEY constraint for table "fact_table".
-- This indicates which attribute(s) uniquely identify each row of data.
CONSTRAINT pk_fact_table PRIMARY KEY (consultant_key,time_key)
);
CREATE TABLE report_time_dim(
time_key INTEGER NOT NULL,
year INTEGER,
-- Specify the PRIMARY KEY constraint for table "time_dim".
-- This indicates which attribute(s) uniquely identify each row of data.
CONSTRAINT pk_report_time_dim PRIMARY KEY (time_key)
);
CREATE TABLE consultant_dim(
consultant_key INTEGER NOT NULL,
project_id INTEGER,
consultant_id INTEGER,
-- Specify the PRIMARY KEY constraint for table "consultant_dim".
-- This indicates which attribute(s) uniquely identify each row of data.
CONSTRAINT pk_consultant_dim PRIMARY KEY (consultant_key)
);
Each table has it's own surrogate key and I have managed to populate the time and consultant tables successfully, however the issue I'm having is with the fact table. When I try to populate it I get the error ORA-00904: "NO_OF_PROJ_PER_CON_PY": invalid identifier. I am unsure how I can go about fixing this and populating the fact table so it will display the information I want. Any help would be appreciated.
--populate fact_table
--table that lists consultant ids, project ids and years
DROP TABLE temp_fact1;
CREATE TABLE temp_fact1 AS
SELECT project_id, fk2_consultant_id, to_number(to_char(lds_project.pj_actual_start_date, 'YYYY')) as which_year FROM lds_project;
--display table
SELECT * FROM temp_fact1;
--list that counts the number of projects for each consultant and specify the year
DROP TABLE temp_fact2;
CREATE TABLE temp_fact2 AS
SELECT which_year, fk2_consultant_id, COUNT(*) project_id FROM temp_fact1 GROUP by fk2_consultant_id, which_year;
--display table
SELECT * FROM temp_fact2;
--fact table surrogate key
DROP SEQUENCE fact_seq;
CREATE SEQUENCE fact_seq
START WITH 1
INCREMENT BY 1
MAXVALUE 1000000
MINVALUE 1
NOCACHE
NOCYCLE;
--load data
INSERT INTO fact_table (fact_key, consultant_key, time_key, no_of_proj_per_con_py)
SELECT fact_seq.nextval, consultant_key, report_time_dim.time_key, no_of_proj_per_con_py FROM temp_fact2, report_time_dim WHERE temp_fact2.which_year = report_time_dim.year;
Try just running this select by itself - it's the last line in your script.
SELECT fact_seq.nextval,
consultant_key,
report_time_dim.time_key,
no_of_proj_per_con_py
FROM temp_fact2, report_time_dim
WHERE temp_fact2.which_year = report_time_dim.year;
It doesn't look like either TEMP_FACT2 or REPORT_TIME_DIM has a column named no_of_proj_per_con_py. I'm not sure where you want to pull that data from, actually.

No row selected

SQL> create table artwork
2 (
artwork_id number(7) NOT NULL,
barcode char(20),
title char(20),
description char(50),
PRIMARY KEY (artwork_id)
);
Table created.
SQL> select * from artwork;
no rows selected
I created the table but it showing me this error dont know. Why table it not showing?
I would expect the create to look like this:
create table artwork (
artwork_id number primary key,
barcode char(20),
title varchar2(20),
description varchar2(50)
);
Notes:
There is no need to have number(7). You can specify the length, but it is not necessary.
For title and description you definitely want varchar2(). There is no reason to store trailing spaces at the end of a name.
That may be the same for barcode, but because it might always be exactly 20 characters or trailing spaces might be significant, you might leave it as char().
The primary key constraint can be expressed in-line. There is no need for a separate declaration.
You probably simply want something like
create table artwork
(
artwork_id number(7) NOT NULL,
barcode varchar2(20),
title varchar2(20),
description varchar2(50),
PRIMARY KEY (artwork_id)
);
insert into artwork values (0, 'barcode', 'fancytitle', 'somedescription');
insert into artwork values (1, 'barcode1', 'fancytitle1', 'somedescription1');
select * from artwork;
This creates a table "ARTWORK", inserts 2 rows in it and then selects all rows currently in the table.
An empty table contains no data, with the create table-statement you only define the bucket of data, you have to fill the bucket as well with items.
I'd also recommend a auto increment column (oracle 12c) or a trigger/sequence to increment the id automatically. But that's something to read later :)

SQLITE3: find IDs across multiple tables

I would like to do analysis of what codes appear in multiple tables under certains conditions. However I don't think the database schema suits the task very well but maybe there's something I don't know about that can help me. Here's a simplified schema:
CREATE TABLE "batchDescription" (
id INTEGER NOT NULL,
name TEXT NOT NULL UNIQUE,
PRIMARY KEY (id)
);
CREATE TABLE "simulationDetails" (
id INTEGER NOT NULL,
ko_index_id INTEGER NOT NULL,
batch_description_id INTEGER NOT NULL,
data1 REAL NOT NULL,
data2 INTEGER NOT NULL,
PRIMARY KEY (id)
FOREIGN KEY(ko_index_id) REFERENCES "koIndex" (id)
FOREIGN KEY(batch_description_id) REFERENCES "batchDescription" (id)
);
CREATE TABLE "koIndex" (
id INTEGER NOT NULL,
number_of_kos INTEGER NOT NULL,
PRIMARY KEY (id)
);
CREATE TABLE "1kos" (
ko_index_id INTEGER NOT NULL,
ko1 INTEGER NOT NULL,
PRIMARY KEY (ko_index_id)
FOREIGN KEY(ko_index_id) REFERENCES "koIndex" (id)
);
CREATE TABLE "2kos" (
ko_index_id INTEGER NOT NULL,
ko1 INTEGER NOT NULL,
ko2 INTEGER NOT NULL,
PRIMARY KEY (ko_index_id)
FOREIGN KEY(ko_index_id) REFERENCES "koIndex" (id)
);
CREATE TABLE "3kos" (
ko_index_id INTEGER NOT NULL,
ko1 INTEGER NOT NULL,
ko2 INTEGER NOT NULL,
ko3 INTEGER NOT NULL,
PRIMARY KEY (ko_index_id)
FOREIGN KEY(ko_index_id) REFERENCES "koIndex" (id)
);
This goes up to table "525kos" which has ko1 to ko525 in it - ko1 to ko525 are IDs that are primary keys in a table not shown here. I want to do an analysis of how often certain IDs are present under certain conditions. Here is a simple example to illustrate:
I would like to like to count the amount of times a certain ID (let's say 127) (in any koX column) in the "13kos" table occurs when simulationDetails.data1 not equal to 0. I would do this on a database called ko.db from the bash command line like:
for ko_idx in {1..13}; do sqlite3 ko.db "select count(ko${ko_idx}) from '13kos' where ko${ko_idx} = 127 and ko_index_id in (select ko_index_id from simulationDetails where data1 != 0);"; done
Already this is slow and inefficient but is simple compared to what I would like to do. What if I wanted to do an analysis of all the IDs in all possible columns in all "Xkos" tables and compare them to where data1 is equal and not equal to zero?
Can anybody direct me to a better way of doing this or is the schema design just not very good for this kind of analysis and I'll have to give up?
EDIT: Thought I'd add a bit of extra detailto avoid confusion. I suspect that a good way to achieve want I want would be to somehow combine all the "Xkos" tables into one temporary table and then search for certain IDs from that table. How would I combine all 525 ko tables without writing out each table name?
How would I combine all 525 ko tables without writing out each table
name?
Create a table with the same number of columns as the largest table (the table into which you merge) allowing nulls.
query the sqlite_master table using something like :-
SELECT * from sqlite_master WHERE name LIKE '%kos%' AND type = 'table'
Loop through the extracted table names building an INSERT SELECT for each table that will insert the rows from the tables into the table created in 1.
See 2. INSERT INTO table SELECT ...; especially in regard to handling missing columns.
All done, the table created in 1 will be populated accordingly.