I have a table that looks like this:
ID  Steps  Letters
--  -----  -------
1   1      a
1   2      e
1   3      b
2   1      c
2   2      d
3   1      b
3   2      a
And a query that returns the following output:
a
b
d
My goal is to create a table (or modify the first one) that drops the Letters column and instead has N additional columns, where N is the number of rows returned by the second query above. Each new column is 1 if the last step for that ID was that specific letter, 0 if that letter appeared only in an earlier step for that ID, and NULL if it never appeared. The result would look like this:
ID  a     b     d
--  ----  ----  ----
1   0     1     NULL
2   NULL  NULL  1
3   1     0     NULL
I assume pivoting makes sense as a way to approach it, but I don't even know where to begin.
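One possible starting point is conditional aggregation with the column list generated from the second query. The sketch below is only an illustration under assumptions: it uses Python's sqlite3 module as a stand-in database, and the names step_log, wanted, last_steps and quote_ident are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the columns follow the question.
conn.execute("CREATE TABLE step_log (ID INTEGER, Steps INTEGER, Letters TEXT)")
conn.executemany(
    "INSERT INTO step_log VALUES (?, ?, ?)",
    [(1, 1, "a"), (1, 2, "e"), (1, 3, "b"),
     (2, 1, "c"), (2, 2, "d"),
     (3, 1, "b"), (3, 2, "a")],
)
# Stand-in for the output of the second query.
conn.execute("CREATE TABLE wanted (Letter TEXT)")
conn.executemany("INSERT INTO wanted VALUES (?)", [("a",), ("b",), ("d",)])


def quote_ident(name):
    """Quote a value so it can be used as a column name."""
    return '"' + name.replace('"', '""') + '"'


letters = [row[0] for row in conn.execute("SELECT Letter FROM wanted ORDER BY Letter")]

# One output column per letter. The outer CASE yields NULL for rows holding a
# different letter, so MAX() is NULL when the letter never occurs, 1 when it
# occurs on the last step, and 0 when it only occurs on earlier steps.
columns = ", ".join(
    "MAX(CASE WHEN s.Letters = ? THEN "
    "CASE WHEN s.Steps = l.last_step THEN 1 ELSE 0 END END) AS " + quote_ident(x)
    for x in letters
)
sql = f"""
    WITH last_steps AS (
        SELECT ID, MAX(Steps) AS last_step FROM step_log GROUP BY ID
    )
    SELECT s.ID, {columns}
    FROM step_log AS s
    JOIN last_steps AS l ON l.ID = s.ID
    GROUP BY s.ID
    ORDER BY s.ID
"""
pivoted = conn.execute(sql, letters).fetchall()
for row in pivoted:
    print(row)
```

The number of columns is not known until the second query has run, so the statement text has to be built at run time. Other databases would do this with their own dynamic SQL facilities rather than a Python loop.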
A temp table has 700+ records with a PK. 12 columns contain Id values from lookup tables. Each lookup table has 4-8 records in it. How can I get a record count for each Id value in table LookupA that has a relationship via the PK to Id values in every other lookup table? Each lookup value in each lookup table needs to be compared, for a record count, to every value in every other lookup table.
I can write a SQL statement to get specific values for specific columns, but that's a long exercise and will slow down the proc.
Here's a sample of the data.
PK LookupA LookupB LookupC
1 1 1 3
2 1 2 3
3 1 3 2
4 2 4 2
5 4 1 1
6 3 2 1
7 2 3 3
8 4 4 3
9 4 3 2
10 1 1 2
The results need to compare LookupA with LookupB and LookupC to get a row count.
                 LookupB        LookupC
Table    Value   1  2  3  4     1  2  3
LookupA  1       2  1  1  0     0  2  2
         2       0  0  1  1     0  1  1
         3       0  1  0  0     1  0  0
         4       1  0  1  1     1  1  1
Then LookupB would be compared to LookupA and LookupC.
And LookupC would be compared to LookupA and LookupB.
With this code you can get the numbers for all combinations of A,B and C in pairs:
-- The label is a constant, so it does not belong in GROUP BY.
-- UNION ALL is enough because the labels keep the branches distinct.
select 'A-B' as Combination, LookupA, LookupB, count(*) as NumRecords
from #temp_table
group by LookupA, LookupB
UNION ALL
select 'A-C' as Combination, LookupA, LookupC, count(*) as NumRecords
from #temp_table
group by LookupA, LookupC
UNION ALL
select 'B-C' as Combination, LookupB, LookupC, count(*) as NumRecords
from #temp_table
group by LookupB, LookupC
After this, if you want to see all the values for LookupA compared with B and C, just look at combinations A-B and A-C.
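With 12 lookup columns, writing every pair by hand gets long, so the branches could be generated instead. The following is a rough sketch under assumptions: it uses Python's sqlite3 module as a stand-in, only the three sample columns, and a made-up table name temp_lookup.

```python
import sqlite3
from itertools import combinations

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the temp table, loaded with the sample data.
conn.execute(
    "CREATE TABLE temp_lookup ("
    "PK INTEGER PRIMARY KEY, LookupA INTEGER, LookupB INTEGER, LookupC INTEGER)"
)
conn.executemany(
    "INSERT INTO temp_lookup VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 3), (2, 1, 2, 3), (3, 1, 3, 2), (4, 2, 4, 2), (5, 4, 1, 1),
     (6, 3, 2, 1), (7, 2, 3, 3), (8, 4, 4, 3), (9, 4, 3, 2), (10, 1, 1, 2)],
)

lookup_columns = ["LookupA", "LookupB", "LookupC"]

# One GROUP BY branch per unordered pair of lookup columns.
branches = [
    f"SELECT '{first}' AS FirstColumn, {first} AS FirstValue, "
    f"'{second}' AS SecondColumn, {second} AS SecondValue, "
    f"COUNT(*) AS NumRecords "
    f"FROM temp_lookup GROUP BY {first}, {second}"
    for first, second in combinations(lookup_columns, 2)
]
sql = "\nUNION ALL\n".join(branches)

# Map (column, value, column, value) to the number of matching records.
pair_counts = {row[:4]: row[4] for row in conn.execute(sql)}
for key in sorted(pair_counts):
    print(key, pair_counts[key])
```

Pairs that never occur together simply have no row, so the zeros in the desired grid would have to be filled in when the result is laid out.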
If I understand correctly, your temp table contains foreign keys to other tables, so why not simply use joins? Something like this.
-- Columns must be referenced through the aliases (a, b) declared in the joins.
SELECT COUNT(DISTINCT a.id) as CountA
, COUNT(DISTINCT b.id) as CountB
-- , etc...
FROM #temp_table t
LEFT OUTER JOIN lookupA a on a.id = t.lookupA
LEFT OUTER JOIN lookupB b on b.id = t.lookupB
-- ...etc
I would suggest reviewing the design if possible. Having so many small tables complicates things. Is it not possible to consolidate them into a single lookup table? With an additional "LookupType" field, all the lookups would live in the same place, which would make retrieval much simpler.
I used a slight derivative of the statement below without any UNIONs to get me where I wanted to go.
/*
select 'A-B' as Combination, LookupA, LookupB, count(*) as NumRecords
from table
group by Combination, LookupA, LookupB
*/
I used a variable and a WHILE loop to place the various summaries where they need to be.
Let's say I have the following table:
a b c
-----------
1 1 5
1 2 3
4 1 2
1 2 4
4 2 10
And I want to delete all rows where none of the first n rows has the same value in a and b as that row.
So for example the resulting tables for various n's would be
n = 1
a b c
-----------
1 1 5
// No row other than the first has a 1 in a, and a 1 in b
n = 2
a b c
-----------
1 1 5
1 2 3
1 2 4
// The fourth row has the same values in a and b as the second, so it is not deleted. The first 2 rows of course match themselves so are not deleted
n = 3
a b c
-----------
1 1 5
1 2 3
4 1 2
1 2 4
// The fourth row has the same values in a and b as the second, so it is not deleted. The first 3 rows of course match themselves so are not deleted
n = 4
a b c
-----------
1 1 5
1 2 3
4 1 2
1 2 4
// The first 4 rows of course match themselves so are not deleted. The fifth row does not have the same value in both a and b as any of the first 4 rows, so is deleted.
I've been trying to work out how to do this using a not in or a not exists, but since I'm interested in two columns matching not just 1 or the whole record, I'm struggling.
Since you are not defining a specific order, the result is not completely defined, but depends on arbitrary choices of implementation regarding which rows are computed first in the limit clause. A different SQLite version for example may give you a different result. With that being said, I believe that you want the following query:
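-- Take the first N rows, then reduce them to distinct (a, b) pairs.
-- Applying LIMIT after DISTINCT would instead return the first N distinct pairs.
select t1.* from table1 t1,
(select distinct firstn.a, firstn.b from (select t2.a, t2.b from table1 t2 limit N) firstn) tabledist
where t1.a=tabledist.a and t1.b=tabledist.b;
Replace N with the desired number of rows.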
EDIT: So, to delete directly from the existing table you need something like:
with toremove(a, b, c) as
(select * from table1 tt
EXCEPT select t1.* from table1 t1,
(select distinct firstn.a, firstn.b from (select t2.a, t2.b from table1 t2 limit N) firstn) tabledist
where t1.a=tabledist.a and t1.b=tabledist.b)
delete from table1 where exists
(select * from toremove
where table1.a=toremove.a and table1.b=toremove.b and table1.c=toremove.c);
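To check the behaviour against the expected tables in the question, here is a small sketch using Python's sqlite3 module. It assumes that "first n rows" means insertion order (rowid), which is an assumption and not something the question defines. The helper names and the scratch table first_n are made up for the example, and NULLs in a or b are not handled.

```python
import sqlite3

ROWS = [(1, 1, 5), (1, 2, 3), (4, 1, 2), (1, 2, 4), (4, 2, 10)]


def delete_unmatched(conn, n):
    """Delete rows whose (a, b) pair does not occur in the first n rows."""
    # Materialise the pairs first so the DELETE never reads a changing table.
    conn.execute("CREATE TEMP TABLE first_n (a, b)")
    conn.execute(
        "INSERT INTO first_n "
        "SELECT DISTINCT a, b "
        "FROM (SELECT a, b FROM table1 ORDER BY rowid LIMIT ?)",
        (n,),
    )
    conn.execute(
        "DELETE FROM table1 WHERE NOT EXISTS ("
        "SELECT 1 FROM first_n "
        "WHERE first_n.a = table1.a AND first_n.b = table1.b)"
    )
    conn.execute("DROP TABLE first_n")


def remaining_rows(n):
    """Build the sample table from scratch, run the delete, return what is left."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (a INTEGER, b INTEGER, c INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", ROWS)
    delete_unmatched(conn, n)
    return conn.execute("SELECT a, b, c FROM table1 ORDER BY rowid").fetchall()


for n in (1, 2, 3, 4):
    print(n, remaining_rows(n))
```

Ordering by rowid makes the result repeatable on SQLite. On a table with a real ordering column, that column would be the better thing to sort on.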
I have a simple table with 2 columns: ID (integer) and Category (string), and each ID can repeat with a few categories, like so:
ID Cat
--- ---
1 A
1 B
2 B
3 A
3 B
3 C
I want to reshape this table so that each unique Category would be a dummy variable (0/1 if ID has it):
ID A B C
--- -- -- --
1 1 1 0
2 0 1 0
3 1 1 1
Now, if the set of unique categories is known (and small) this is an easy CASE WHEN statement x no. of unique categories.
My questions are:
a) What if it is unknown or really large? How do I create this 'CASE WHEN' effect automatically?
b) More importantly: I'm not necessarily interested in all categories (say only dummies for 'A' and 'B') but only categories which I have in a separate table called Cats, which is a simple 1 column holding my relevant categories (again, unknown how many), like:
Cat
---
A
B
How do I create dummy variables for only the categories in this dynamic table?
Do you think all of this should really be done in other tools e.g. R?
Thanks!
(I'm using Teradata SQL with SQLA, but I think it's a general SQL question)
In R, just use table():
table(dat)
Cat
ID A B C
1 1 1 0
2 0 1 0
3 1 1 1
and in case you want to have the binary table for a group of Cat:
table(subset(dat,Cat %in% c('A','B')))
Cat
ID A B
1 1 1
2 0 1
3 1 1
I don't know what in the world is the best way to go about this. I have a very large array of columns, each one with 1-25 rows associated with it. I need to be able to combine all into one large column, skipping blanks if at all possible. Is this something that Access can do?
a b c d e f g h
3 0 1 1 1 1 1 5
3 5 6 8 8 3 5
1 1 2 2 1 5
4 4 2 1 1 5
1 5
There are no blanks within each column, but each column has a different number of values in it. They need to be added from left to right: a, b, c, d, e, f, and so on. The 0 from b needs to go in the first blank cell after the second 3 in a, and the first 5 in h needs to come directly after the 1 in g, with no blanks.
So you want a result like:
3
3
0
5
1
4
1
6
1
4
etc?
Here is how I would approach the problem. Insert your array into a work table that has an autonumber column called id as well as the array columns. The autonumber is important for retaining the order the data is in; databases do not guarantee an order unless you give them something to sort on.
Create a final table with an autonumber column (see the note above on why you need an autonumber) and the column you want in your final table.
Run a separate insert statement for each column in your work table, in the order you want the data.
The inserts would look something like:
insert into table2 (colA)
select columnA from table1 where columnA is not null order by id
insert into table2 (colA)
select columnB from table1 where columnB is not null order by id
insert into table2 (colA)
select columnC from table1 where columnC is not null order by id
Now when you run select colA from table2 order by id, you should have the results you need.
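The same steps can be tried out end to end with a small sketch. This one uses Python's sqlite3 module instead of Access, and the three-column sample data is invented for illustration because the alignment of the table in the question is lost. NULL stands for a blank cell.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Work table: the autonumber id preserves the original row order.
conn.execute(
    "CREATE TABLE table1 ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "columnA INTEGER, columnB INTEGER, columnC INTEGER)"
)
# Hypothetical data: column A has 2 values, B has 3, C has 1.
conn.executemany(
    "INSERT INTO table1 (columnA, columnB, columnC) VALUES (?, ?, ?)",
    [(3, 0, 6), (3, 5, None), (None, 1, None)],
)
# Final table: its own autonumber records the order of insertion.
conn.execute(
    "CREATE TABLE table2 (id INTEGER PRIMARY KEY AUTOINCREMENT, colA INTEGER)"
)

# One insert per source column, left to right, skipping the blanks.
for column in ("columnA", "columnB", "columnC"):
    conn.execute(
        f"INSERT INTO table2 (colA) "
        f"SELECT {column} FROM table1 WHERE {column} IS NOT NULL ORDER BY id"
    )

stacked = [row[0] for row in conn.execute("SELECT colA FROM table2 ORDER BY id")]
print(stacked)
```

With many columns, looping over the column names is what keeps this manageable. In Access the equivalent loop would live in VBA or be a saved query per column.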
I have two intermediate result sets in a CREATE VIEW statement. The result sets are derived from two different join paths, and I need to union them. But it doesn't stop there. Since the ID column needs to be unique, the rows in result set 2 that have the same IDs as rows in result set 1 need to overwrite those rows.
Let me illustrate this here:
Result set 1
ID Value
------------
1 a
3 a
5 a
6 a
7 a
8 a
Result Set 2
ID Value
------------
2 b
4 b
5 b
7 b
9 b
10 b
End result set
ID value
------------
1 a
2 b
3 a
4 b
5 b
6 a
7 b
8 a
9 b
10 b
I am not sure how to approach this. Union/except/intersect will create duplicate ids, so that's no good.
SELECT COALESCE(set2.ID, set1.ID) AS ID,
CASE WHEN set2.ID IS NULL THEN set1.Value ELSE set2.Value END AS Value
FROM set1
FULL JOIN set2
ON set1.ID = set2.ID
Alternatively, remove the rows from result set 1 whose ID exists in result set 2 before doing a UNION ALL.
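That second suggestion could look roughly like the sketch below, which filters result set 1 with NOT EXISTS and therefore also works inside a view. It is run through Python's sqlite3 module purely for illustration, and set1 and set2 are treated as plain tables here, which is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE set1 (ID INTEGER, Value TEXT)")
conn.execute("CREATE TABLE set2 (ID INTEGER, Value TEXT)")
conn.executemany("INSERT INTO set1 VALUES (?, 'a')", [(i,) for i in (1, 3, 5, 6, 7, 8)])
conn.executemany("INSERT INTO set2 VALUES (?, 'b')", [(i,) for i in (2, 4, 5, 7, 9, 10)])

# Keep every row of set2, plus only those rows of set1 whose ID is not in set2.
sql = """
    SELECT ID, Value FROM set2
    UNION ALL
    SELECT ID, Value FROM set1
    WHERE NOT EXISTS (SELECT 1 FROM set2 WHERE set2.ID = set1.ID)
    ORDER BY ID
"""
merged = conn.execute(sql).fetchall()
for row in merged:
    print(row)
```

Unlike the FULL JOIN version, this does not depend on FULL JOIN support in the database, and each ID still appears exactly once as long as IDs are unique within each set.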