I'd like to run a query on a table that matches any of the following sets of conditions:
SELECT
id,
time
FROM
TABLE
WHERE
<condition1 is True> OR
<condition2 is True> OR
<condition3 is True> OR
...
Each condition might look like:
id = 'id1' AND t > 20 AND t < 40
The values for each condition (the id, 20, and 40 above) come from the rows of a pandas dataframe that is 20k rows long. I see two options that would technically work:
make 20k queries to the database, one for each condition, and concat the result
generate a (very long) query as above and submit
My question: what would be an idiomatic/performant way to accomplish this?
I suspect neither of the above is an appropriate approach, and this problem is somewhat difficult to google.
I think it would be better to create a temporary table with columns id, t1, and t2 and put your 20k rows in there. Then just join to this temporary table:
SELECT DISTINCT TABLE.id, time
FROM TABLE
JOIN TEMP_TABLE tmp ON
TABLE.ID = tmp.ID AND TABLE.T > tmp.T1 AND TABLE.T < tmp.T2;
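A minimal sketch of this temporary-table approach driven from Python, using the standard-library sqlite3 module as a stand-in for whatever database is actually in use. The table name readings, the temp table conditions, the helper fetch_matching, and the sample rows are all assumptions made for illustration; with pandas, the condition rows could come from df.itertuples(index=False, name=None).

```python
import sqlite3


def fetch_matching(conn, conditions):
    """Return distinct (id, t) rows of `readings` that fall strictly inside
    any (id, t1, t2) window listed in `conditions`."""
    cur = conn.cursor()
    # Temp tables are private to this connection; rebuild on every call.
    cur.execute("DROP TABLE IF EXISTS temp.conditions")
    cur.execute("CREATE TEMP TABLE conditions (id TEXT, t1 REAL, t2 REAL)")
    # One bulk insert instead of 20k separate queries.
    cur.executemany("INSERT INTO conditions VALUES (?, ?, ?)", conditions)
    cur.execute(
        """
        SELECT DISTINCT r.id, r.t
        FROM readings AS r
        JOIN conditions AS c
          ON r.id = c.id AND r.t > c.t1 AND r.t < c.t2
        ORDER BY r.id, r.t
        """
    )
    return cur.fetchall()


# Hypothetical data standing in for the real table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (id TEXT, t REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("id1", 10), ("id1", 25), ("id1", 40), ("id2", 5), ("id2", 50)],
)

# Each tuple plays the role of one dataframe row: (id, lower bound, upper bound).
conditions = [("id1", 20, 40), ("id2", 0, 10)]
rows = fetch_matching(conn, conditions)
print(rows)
```

On a server database the same idea applies, though the syntax for creating temporary tables and for bulk loading differs between systems.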
I hope this is not too obvious, but if it is, please don't kill me; I'm really new at this and struggle to find much info.
This code should do the following:
-grab columns from 2 different tables
-only select rows that are not on a third table on a particular date
proc sql;
create table DiallerExtra as
SELECT a.AGREEMENT_NUMBER,
b.CURRENT_FIRST_NAME,
b.TELEPHONE_NUMBER
FROM TABLE1 a, TABLE2 b
WHERE a.AGREEMENT_NUMBER
NOT IN (SELECT AgreementNumber
FROM TABLE3 (WHERE =(DeleteDate >= today()-1)))
;
quit;
I first tried this (below) and it worked fine to filter the results (I ended up with 15 rows only).
proc sql;
create table DiallerExtra as
SELECT a.AGREEMENT_NUMBER
FROM TABLE1 a
WHERE a.AGREEMENT_NUMBER
NOT IN (SELECT AgreementNumber
FROM TABLE3 (WHERE =(DeleteDate >= today()-1)))
;
quit;
But when I tried the first code, it doesn't seem to filter correctly, because it spits out all the agreements in TABLE2, which is a lot.
The "filtering" of your NOT IN logic is not the problem.
Add something to tell SQL how to combine TABLE1 and TABLE2.
If you want to combine two tables with SQL, you need to tell it how to match the observations. Otherwise every observation in TABLE1 is matched to every observation in TABLE2. In your first example, even if only one observation in TABLE1 has a value of A.AGREEMENT_NUMBER that passes the NOT IN test against the TABLE3 observations selected by your WHERE= dataset option, it will be matched with every observation in TABLE2. So if TABLE2 had 100 customers, the result set will have 100 observations.
So add another condition to your WHERE clause. For example, if both TABLE1 and TABLE2 have AGREEMENT_NUMBER, then perhaps you want to match on that.
create table DiallerExtra as
SELECT a.AGREEMENT_NUMBER
, b.CURRENT_FIRST_NAME
, b.TELEPHONE_NUMBER
FROM TABLE1 a
, TABLE2 b
WHERE a.AGREEMENT_NUMBER = b.AGREEMENT_NUMBER
and a.AGREEMENT_NUMBER NOT IN
(SELECT AgreementNumber FROM TABLE3 (WHERE =(DeleteDate >= today()-1)))
;
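To see the effect of the missing join condition in isolation, here is a small sketch run through Python's sqlite3 module rather than SAS, so the WHERE= dataset option and the date filter are left out. The table contents and lowercase column names are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE table1 (agreement_number INTEGER);
    CREATE TABLE table2 (agreement_number INTEGER, first_name TEXT);
    CREATE TABLE table3 (agreement_number INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO table3 VALUES (3);  -- agreement 3 should be excluded
    """
)

# No join condition: every surviving table1 row pairs with every table2 row.
unjoined = conn.execute(
    """
    SELECT a.agreement_number, b.first_name
    FROM table1 a, table2 b
    WHERE a.agreement_number NOT IN (SELECT agreement_number FROM table3)
    """
).fetchall()

# With the join condition: one row per agreement that survives the filter.
joined = conn.execute(
    """
    SELECT a.agreement_number, b.first_name
    FROM table1 a, table2 b
    WHERE a.agreement_number = b.agreement_number
      AND a.agreement_number NOT IN (SELECT agreement_number FROM table3)
    ORDER BY a.agreement_number
    """
).fetchall()

print(len(unjoined), len(joined))
```

The NOT IN filter keeps two agreements in both queries; only the second query stops each of them from being repeated once per row of table2.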
(P.S. I am still learning SQL and you can consider me a newbie)
I have 2 sample tables as follows:
Table 1
|Profile_ID| |Img_Path|
Table 2
|Profile_ID| |UName| |Default_Title|
My scenario is this: from the 2nd table, I need to fetch all the records that contain a certain word, for which I have the following query:
Select Profile_Id,UName from
Table2 Where
Contains(Default_Title, 'Test')
ORDER BY Profile_Id
OFFSET 5 ROWS
FETCH NEXT 20 ROWS ONLY
(Note that I am setting the OFFSET due to requirements.)
Now, as soon as I retrieve a record from the 2nd table, I need to fetch the matching record from the 1st table based on the Profile_Id.
So, I need to return the following 2 results in one single statement:
|Profile_Id| |Img_Path|
|Profile_Id| |UName|
And I need to return the results in side-by-side columns, like:
|Profile_Id| |Img_Path| |UName|
(Note that I had to merge the 2 Profile_Id columns into one, as they both contain the same data.)
I am still learning SQL and I am learning about Union, Join etc., but I am a bit confused as to which way to go.
You can use join:
select t1.*, t2.UName
from table1 t1 join
(select Profile_Id, UName
from Table2
where Contains(Default_Title, 'Test')
order by Profile_Id
offset 5 rows fetch next 20 rows only
) t2
on t2.profile_id = t1.profile_id
SELECT a.Profile_Id, a.Img_Path, b.UName
FROM table1 a INNER JOIN table2 b ON a.Profile_Id=b.Profile_Id
WHERE Contains(b.Default_Title, 'Test')
Dual data entry checking. The same data is entered by two people, and now I want to compare the two sets to ensure data quality.
This will depend a lot on the measure of quality that you want to use.
As an example, you can just check for the fraction of entries that match exactly,
CASE WHEN COLUMN1 = COLUMN2 THEN 1 ELSE 0 END AS MatchedData
Then you can sum MatchedData and divide by the total number of entries.
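As a rough sketch of that calculation, the snippet below computes the match count, the total, and the matching fraction in one query, run through Python's sqlite3 module. The entries table and its rows are hypothetical; COLUMN1 and COLUMN2 stand for the two people's versions of the same field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries (record_id INTEGER, column1 TEXT, column2 TEXT)"
)
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [(1, "a", "a"), (2, "b", "x"), (3, "c", "c"), (4, "d", "d")],
)

# Numeric 1/0 flags can be summed directly; AVG over 1.0/0.0 gives the fraction.
matched, total, fraction = conn.execute(
    """
    SELECT SUM(CASE WHEN column1 = column2 THEN 1 ELSE 0 END),
           COUNT(*),
           AVG(CASE WHEN column1 = column2 THEN 1.0 ELSE 0.0 END)
    FROM entries
    """
).fetchone()

print(matched, total, fraction)
```

Note that a comparison involving NULL is not true, so a row where either entry is missing counts as a mismatch here.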
You can use a correlated subquery for this. First decide which columns must hold identical values for two records to be considered duplicates. Since the records were entered by different users, a column such as created_by_user (if it exists) may differ while all the others match. Then put the matching columns in the subquery below to get the list of duplicate records.
SELECT
*
FROM
MY_TABLE t1
WHERE
ROWID <> (
SELECT
MAX(ROWID)
FROM
MY_TABLE t2
WHERE
t1.col1 = t2.col1
AND
t1.col2 = t2.col2
)
I'm trying to update medical data from one table to another after switching from one system to another. We have two tables, for simplicity I'll make this a simple example. There are many columns in these tables in reality (not just 5).
Table1:
name, date, var1, var2, var3
Table2:
name, date, var1a, var2a, var3a
I want to transfer data from Table 1 to Table 2 for any rows where there isn't previous data for that date, where var1 = var1a, etc (same columns with different names).
I was trying to do something with a loop, but realized that may not be necessary.
I had gotten this far but wasn't sure if this was OK:
UPDATE Table2 VALUES (date, var1a, var2a, var3a)
SELECT date, var1, var2, var3 FROM Table1
Is that correct syntax so far? Or do I need to map the variables to translate var1 into var1a, etc?
How do I add a check to make sure I don't overwrite any data already in Table2? I don't want to add data if there is already data for that date/name combination.
Thanks!
You can INSERT into TABLE2 all values from TABLE1 that do not already exist in Table2:
INSERT INTO Table2 (date, var1a, var2a, var3a)
SELECT date, var1, var2, var3
FROM Table1 t1
WHERE NOT EXISTS (SELECT 1 FROM Table2 t2 WHERE t2.date = t1.date)
Already existing values are specified by comparing the date column. You can add any other predicates in the SELECT subquery of the NOT EXISTS expression to suit your needs.
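For instance, to treat the date/name combination as the key, the name predicate can be added to the NOT EXISTS subquery. The sketch below tries this with Python's sqlite3 module; the sample rows are invented, and the name column is copied across as well, which is an assumption about what Table2 should end up holding.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Table1 (name TEXT, date TEXT, var1 INTEGER, var2 INTEGER, var3 INTEGER);
    CREATE TABLE Table2 (name TEXT, date TEXT, var1a INTEGER, var2a INTEGER, var3a INTEGER);
    INSERT INTO Table1 VALUES ('ann', '2020-01-01', 1, 2, 3),
                              ('ann', '2020-01-02', 4, 5, 6),
                              ('bob', '2020-01-01', 7, 8, 9);
    INSERT INTO Table2 VALUES ('ann', '2020-01-01', 10, 20, 30);
    """
)

# Insert only the Table1 rows whose (name, date) pair is absent from Table2.
with conn:  # commits on success, rolls back on error
    conn.execute(
        """
        INSERT INTO Table2 (name, date, var1a, var2a, var3a)
        SELECT t1.name, t1.date, t1.var1, t1.var2, t1.var3
        FROM Table1 t1
        WHERE NOT EXISTS (
            SELECT 1 FROM Table2 t2
            WHERE t2.name = t1.name AND t2.date = t1.date
        )
        """
    )

rows = conn.execute("SELECT * FROM Table2 ORDER BY name, date").fetchall()
print(rows)
```

The row that already existed for ann on 2020-01-01 keeps its original values; only the two missing combinations are added.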
You could use an update with a join. You don't need to update the date column, since that is what you are using to match rows between the two tables.
Either you generate a dynamic query based on the empty/null-valued columns, or you could do something like the below, which keeps the existing value in the column if there is one in table2, or else puts in the corresponding value from table1.
The approach below requires less logic and is easier to implement, but it will produce I/O equivalent to updating the entire table.
update tbl2
set var1a=isnull(tbl2.var1a,tbl1.var1)
, var2a=isnull(tbl2.var2a,tbl1.var2)
, var3a=isnull(tbl2.var3a,tbl1.var3)
from table1 tbl1
inner join table2 tbl2
on tbl1.name=tbl2.name
and tbl1.date=tbl2.date
Considerations:
This approach requires less logic and is easier to implement, but it will produce I/O equivalent to updating the entire table2. If you have a smallish table, I would go with this approach.
If it's a big table, then you should look into building more specific queries to reduce I/O.
This code is tested in Access but something very similar should work in SQL Server 2012:
UPDATE Table2 RIGHT JOIN Table1 ON Table2.date = Table1.date
SET Table2.name = Table1.name, Table2.date = Table1.date, Table2.var1a = Table1.var1, Table2.var2a = Table1.var2, Table2.var3a = Table1.var3
WHERE (Table2.date Is Null);
Explanation: this uses a right join so that within the query all the data from Table1 is present and where there is a matched date for Table2 that data is present too. We then ignore all cases where there is any data for Table2 and update the query in all other cases - the update in fact inserts new data into Table2.
I have two tables in my SQLite Database (dummy names):
Table 1: FileID F_Property1 F_Property2 ...
Table 2: PointID ForeignKey(fileid) P_Property1 P_Property2 ...
The entries in Table2 all have a foreign key column that references an entry in Table1.
I now would like to select entries from Table2 where for example F_Property1 of the referenced file in Table1 has a specific value.
I tried something naive:
select * from Table2 where fileid=(select FileID from Table1 where F_Property1 > 1)
Now this actually works..kind of. It selects a correct file id from Table1 and returns entries from Table2 with this ID. But it only uses the first returned ID. What I need it to do is basically connect the returned IDs from the inner select by OR so it returns data for all the IDs.
How can I do this? I think it is some kind of cross-table query like what is asked here What is the proper syntax for a cross-table SQL query? but these answers contain no explanation of what they are actually doing, so I'm struggling with any implementation.
They are using JOIN statements, but wouldn't this mix entries from Table1 and Table2 together while only checking matching IDs in both tables? At least that is how I understand this http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins
As you may have noticed from the style, I'm very new to using databases in general, so please forgive me if not everything is clear about what I want. Please leave a comment and I will try to improve the question if necessary.
The = operator compares a single value against another, so it is assumed that the subquery returns only a single row.
To check whether a (column) value is in a set of values, use IN:
SELECT *
FROM Table2
WHERE fileid IN (SELECT FileID
FROM Table1
WHERE F_Property1 > 1)
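The difference is easy to check in SQLite itself. The sketch below uses Python's sqlite3 module with made-up rows: two files satisfy F_Property1 > 1, so the = version returns points for only one of them while the IN version returns points for both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Table1 (FileID INTEGER PRIMARY KEY, F_Property1 INTEGER);
    CREATE TABLE Table2 (PointID INTEGER PRIMARY KEY,
                         fileid INTEGER REFERENCES Table1(FileID));
    INSERT INTO Table1 VALUES (1, 0), (2, 5), (3, 9);
    INSERT INTO Table2 VALUES (10, 1), (11, 2), (12, 3), (13, 3);
    """
)

inner = "SELECT FileID FROM Table1 WHERE F_Property1 > 1"

# "=" treats the subquery as a single value, so only one FileID is compared.
with_equals = conn.execute(
    f"SELECT PointID FROM Table2 WHERE fileid = ({inner}) ORDER BY PointID"
).fetchall()

# "IN" tests membership in the full set of FileIDs the subquery returns.
with_in = conn.execute(
    f"SELECT PointID FROM Table2 WHERE fileid IN ({inner}) ORDER BY PointID"
).fetchall()

print(with_equals, with_in)
```

Which single FileID the = version ends up using is not something to rely on, since the subquery has no ORDER BY.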
The way joins work is not by "mixing" the data, but sort of combining them based on the key.
In your case (I am assuming the key field in Table 1 is unique), if you join those two tables on the primary key field, you will end up with all the entries in table2 plus all corresponding fields from table1. If you were doing this:
select * from table1, table2 where table1.FileID=table2.fileid;
then, providing your key fields are set up right, you will end up with the following:
PointID ForeignKey(fileid) P_Property1 P_Property2 FileID F_Property1 F_Property2
The field values from table1 would be from matching rows.
Now, if you do this:
select table2.* from table1, table2 where
table1.FileID=table2.fileid and F_Property1>1;
This would return essentially the same set of records, but it will show only the columns from the second table, and only the rows whose referenced file satisfies the WHERE condition on the first table.
Hope this helps :)
If I understood your question correctly this will get the job done.
Select t2.*
from table1 t1
inner join table2 t2 on t2.id = t1.id
where t1.Prop = 'SomeValue'