SQL: Populating Column B where Column A has a match elsewhere in Column B - sql

I’m somewhat of a newbie to SQL queries, especially anything containing logic, and although I've searched for hours, finding the exact terms to search for is not easy in this case! I have a relatively simple one, I’m sure:
A table has 2 columns, and each row contains data about a function in a program. Some functions have a parent function associated (for grouping). Column A is the unique function ID. Column B indicates, when applicable, the parent function’s ID. All parent function IDs are independent and valid function IDs that exist elsewhere in column A.
For reporting purposes I need to list the functions grouped by their parent ID, listing the parent function with the child functions. I can easily report by parent function ID, but the problem is that a parent function does not know that it is a parent function because its column B is empty!
What I need to do is complete the value in Column B if it is empty and the function is referenced elsewhere as a parent function.
Otherwise stated, for each row that has a null value in Column B:
Take the value from column A
Check for the existence of that value in ANY row on column B
If there is a match, inject the value into column B (so that Column A and B have the same value)
What I have: (Query: SELECT function_id, parent_function FROM functions)
FUNCTION_ID PARENT_FUNCTION
4
13 4
79
138 4
195
314 345
345
What I need to have:
FUNCTION_ID PARENT_FUNCTION
4 4
13 4
79
138 4
195
314 345
345 345
Any Ideas? I can't wait to get more familiar with SQL! Thanks ahead of time.

This should work for you:
UPDATE functions
SET parent_function = function_id
WHERE parent_function IS NULL
AND function_id IN (SELECT parent_function FROM functions)
This will set parent_function equal to function_id where it has not yet been set, and where it appears somewhere in the parent_function column.
If you don't actually want to modify the table data but still return values that you need, you can use similar logic like this:
SELECT f.function_id, COALESCE(f.parent_function, f2.function_id) as parent_function
FROM functions f
LEFT JOIN functions f2
ON f.function_id = f2.function_id
AND f2.function_id IN (SELECT parent_function FROM functions)
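As a quick check, both queries above can be run against the sample data from the question. This is only a sketch using Python's built-in sqlite3 module with an in-memory database; SQLite is an assumption here, since the question does not name a database. Some engines (MySQL, for example) reject an UPDATE whose subquery reads the table being updated, so the UPDATE may need rewriting there.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE functions (function_id INTEGER PRIMARY KEY, parent_function INTEGER)"
)
conn.executemany(
    "INSERT INTO functions VALUES (?, ?)",
    [(4, None), (13, 4), (79, None), (138, 4), (195, None), (314, 345), (345, None)],
)

# Read-only variant: fill in the parent only in the result set.
select_rows = conn.execute("""
    SELECT f.function_id, COALESCE(f.parent_function, f2.function_id) AS parent_function
    FROM functions f
    LEFT JOIN functions f2
      ON f.function_id = f2.function_id
     AND f2.function_id IN (SELECT parent_function FROM functions)
    ORDER BY f.function_id
""").fetchall()

# In-place variant: write the parent back into the table.
conn.execute("""
    UPDATE functions
    SET parent_function = function_id
    WHERE parent_function IS NULL
      AND function_id IN (SELECT parent_function FROM functions)
""")
updated_rows = conn.execute(
    "SELECT function_id, parent_function FROM functions ORDER BY function_id"
).fetchall()

for row in updated_rows:
    print(row)
```

Functions 79 and 195 keep an empty parent in both variants because no other row references them.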

Maybe you can compare the two tables using EXCEPT or INTERSECT:
http://msdn.microsoft.com/en-us/library/ms188055.aspx
More tutorials:
http://www.mssqltips.com/sqlservertip/1327/compare-sql-server-datasets-with-intersect-and-except/

How's this look?
select distinct
t1.funx, t1.parent,
case when t2.parent is null then t1.parent
else t2.parent end as newparent
from
tbl t1 left outer join
tbl t2 on
t1.funx = t2.parent

Related

Pattern Matching or Fuzzy Matching of two tables based on one column

Assuming I have the right naming, what I am trying to write is a function or stored procedure to compare names and find out if they are the same value.
I think it's called fuzzy matching.
For example, table A has 2 columns and table B has 3 columns:
Table A:
Name Number
Hello 24
Evening 56
Table B:
Name Num F
Heello 23 some value
GoodEvening 15 some value
I want a table like:
A D
Hello Heello
Morning GoodMorning
Currently, I'm using
Select A.Name, B.Name
from TableA A
left join TableB B
on A.Name like B.Name
or (LTRIM(RTRIM(REPLACE(REPLACE(REPLACE( A.Name,' ',''),'-',''),'''',''))) = LTRIM(RTRIM(REPLACE(REPLACE(REPLACE(B.Name,' ',''),'-',''),'''',''))))
OR (A.Name LIKE '%'+B.Name+'%')
OR (B.Name LIKE '%'+A.Name+'%')
It gives me a result, but it is not very accurate and is very slow. Is there any other way I could compare these values?

SQLite: Matching a column containing a single string to another column containing comma-separated values [duplicate]

I have a table with a column that has concatenated values like this
Table CHILD:
ChildId Values
2 x123,j455
3 f456,z789
4 m333,y567
5 x123,h888
And I have a master table MASTER that has
Table MASTER:
MainValues
x123
f456
y567
I need to get a query that'll select the following data
ChildId MainValues
2 x123
3 f456
4 y567
5 x123
Basically, match a value from MASTER in the child values and return only the master value. How can I do this? I have tried matching against the second table with IN and LIKE clauses, but that doesn't help much since the values are CSV. Is there a way to split and match in SQLite?
EDIT: Table and column names are fictional and intended just to explain this question better
Use LIKE, wrapping both the list and the value in commas so that only whole entries match:
SELECT ChildId,MainValues FROM CHILD INNER JOIN MASTER ON ','||[Values]||',' like '%,'||MainValues||',%'
Also, please refrain from using keywords like values for column names...
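The delimiter trick above can be tried on the question's sample data. The sketch below assumes Python's built-in sqlite3 module and an in-memory database; the column named Values is quoted because it is a keyword. Note that LIKE treats % and _ in MainValues as wildcards, so this assumes the master values contain neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CHILD (ChildId INTEGER, "Values" TEXT);
    CREATE TABLE MASTER (MainValues TEXT);
    INSERT INTO CHILD VALUES
        (2, 'x123,j455'), (3, 'f456,z789'), (4, 'm333,y567'), (5, 'x123,h888');
    INSERT INTO MASTER VALUES ('x123'), ('f456'), ('y567');
""")

# Wrap the list and the candidate in commas so that only whole entries match,
# regardless of their position in the list.
matches = conn.execute("""
    SELECT c.ChildId, m.MainValues
    FROM CHILD c
    INNER JOIN MASTER m
      ON ',' || c."Values" || ',' LIKE '%,' || m.MainValues || ',%'
    ORDER BY c.ChildId
""").fetchall()

for row in matches:
    print(row)
```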
Older versions of SQLite don't have a function to find the index of a character in a string (instr() was added in version 3.7.15), so there you have to rely on something else. Idan's method is good too but can be slower. You may try this:
SELECT c.childID, m.mainvalues
FROM CHILD c
JOIN MASTER m
WHERE m.mainvalues = substr(c.ivalues, -length(c.ivalues), 4)
OR m.mainvalues = substr(c.ivalues, 6);
I have used 4 and 6 assuming your number of characters before and after the ,. If that's not fixed you can try:
SELECT c.childID, m.mainvalues
FROM CHILD c
JOIN MASTER m
WHERE m.mainvalues = substr(c.ivalues, -length(c.ivalues), length(m.mainvalues))
OR m.mainvalues = substr(c.ivalues, length(m.mainvalues) + 2);

Concatenating codes to obtain sum

I've been trying to solve this problem for the past 2 days but can't even seem to find the right terms to google it.
I have 3 tables.
This one, with client codes that changed:
ActualCode=11111111 PreviousCode=44444444
And these two tables with value 1 and value 2:
PreviousCode=11111111, Value1= 50,00, Value2= 0,00
ActualCode=44444444 , Value1= 0,00, Value2 = 50,00
I need to sum the values for each relation of Previous and Actual codes from the first table.
I.E.
For
ActualCode=11111111, PreviousCode=44444444
I need to be able to get:
Code=11111111 Value1=50,00 Value2=50,00
Looking forward to your answer :D
Thanks,
P
You can join the tables and sum the values:
select c.actualcode,
sum(ac.value1) + sum(pc.value1) as value1,
sum(ac.value2) + sum(pc.value2) as value2
from codes c
join actualcodes ac on c.actualcode = ac.actualcode
join previouscodes pc on c.previouscode = pc.previouscode
group by c.actualcode;
If you could have values in the main table that don't have corresponding rows in the values tables, then you should use outer joins instead.
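A minimal sketch of the outer-join variant, using Python's built-in sqlite3 module. The sample rows are hypothetical (the second code pair, and the assumption that each values table is keyed by the matching code, are not from the question). With LEFT JOIN, a missing row yields NULL, so each SUM is wrapped in COALESCE to keep the total from turning into NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE codes (actualcode INTEGER, previouscode INTEGER);
    CREATE TABLE actualcodes (actualcode INTEGER, value1 REAL, value2 REAL);
    CREATE TABLE previouscodes (previouscode INTEGER, value1 REAL, value2 REAL);
    -- Hypothetical data: the second pair has no row in previouscodes.
    INSERT INTO codes VALUES (11111111, 44444444), (22222222, 55555555);
    INSERT INTO actualcodes VALUES (11111111, 50.0, 0.0), (22222222, 10.0, 5.0);
    INSERT INTO previouscodes VALUES (44444444, 0.0, 50.0);
""")

totals = conn.execute("""
    SELECT c.actualcode,
           COALESCE(SUM(ac.value1), 0) + COALESCE(SUM(pc.value1), 0) AS value1,
           COALESCE(SUM(ac.value2), 0) + COALESCE(SUM(pc.value2), 0) AS value2
    FROM codes c
    LEFT JOIN actualcodes ac ON c.actualcode = ac.actualcode
    LEFT JOIN previouscodes pc ON c.previouscode = pc.previouscode
    GROUP BY c.actualcode
    ORDER BY c.actualcode
""").fetchall()

for row in totals:
    print(row)
```

If either values table can hold several rows per code, the two joins multiply rows and the sums are inflated; aggregating each table in a subquery before joining would avoid that.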

SQL Case with calculation on 2 columns

I have a value table and I need to write a case statement that touches 2 columns: Below is the example
Type State Min Max Value
A TX 2 15 100
A TX 16 30 200
A TX 31+ 500
Let's say I have another table that has the following:
Type State Weight Value
A TX 14 ?
So when I join the tables, I need a case statement that looks at the weight, type and state from table 2, compares them to table 1, recognizes that the weight falls between 2 and 15 (row 1), and updates Value in table 2 with 100.
Is this doable?
Thanks
It returns 0 if there are no rows in this range of values. Here table_a is the table holding Weight and table_b is the lookup table with Min and Max.
select Type, State, Weight,
coalesce((select table_b.Value
from table_b
where table_b.Type = table_a.Type
and table_b.State = table_a.State
and table_a.Weight between table_b.Min and table_b.Max), 0) as Value
from table_a
For an Alteryx solution: (1) run both tables into a Join tool, joining on Type and State; (2) Send the output to a Filter tool where you force Weight to be between Min and Max; (3) Send that output to a Select tool, where you grab only the specific columns you want; (since the Join will give you all columns from all tables). Done.
Caveats: the data running from Join to Filter could be large, since you are joining every Type/State combination in the Lookup table to the other table. Depending on the size of your datasets, that might be cumbersome. Alteryx is very fast though, and at least we're limiting on State and Type, so if your datasets aren't too large, this simple solution will work fine.
With larger data, try to do it as part of your original select, utilizing one of the other solutions given here for your SQL query.
Assuming the Min and Max columns in the first table are of an integer type, you need an INNER JOIN on the ranges:
SELECT *
FROM another_table a
JOIN first_table b
ON a.type = b.type
AND a.State = b.State
AND a.Weight BETWEEN b.min AND b.max
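The range join can be sketched against the question's sample rows with Python's built-in sqlite3 module. The column names min_w and max_w are hypothetical stand-ins for Min and Max, and storing the open-ended "31+" row as a NULL upper bound is an assumption. BETWEEN never matches a NULL bound, so the upper bound is tested explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE first_table (type TEXT, state TEXT, min_w INTEGER, max_w INTEGER, value INTEGER);
    CREATE TABLE another_table (type TEXT, state TEXT, weight INTEGER);
    -- The "31+" row is stored with a NULL upper bound (assumption).
    INSERT INTO first_table VALUES
        ('A', 'TX', 2, 15, 100), ('A', 'TX', 16, 30, 200), ('A', 'TX', 31, NULL, 500);
    INSERT INTO another_table VALUES ('A', 'TX', 14), ('A', 'TX', 40);
""")

# Join each weight to the range that contains it; a NULL max_w means "no upper limit".
priced = conn.execute("""
    SELECT a.type, a.state, a.weight, b.value
    FROM another_table a
    JOIN first_table b
      ON a.type = b.type
     AND a.state = b.state
     AND a.weight >= b.min_w
     AND (b.max_w IS NULL OR a.weight <= b.max_w)
    ORDER BY a.weight
""").fetchall()

for row in priced:
    print(row)
```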

Difference in NA/NULL treatment using dplyr::left_join (R lang) vs. SQL LEFT JOIN

I want to left join two dataframes, where there might be NAs in the join column on both side (i.e. both code columns)
a <- data.frame(code=c(1,2,NA))
b <- data.frame(code=c(1,2,NA, NA), name=LETTERS[1:4])
Using dplyr, we get:
left_join(a, b, by="code")
code name
1 1 A
2 2 B
3 NA C
4 NA D
Using SQL, we get:
CREATE TABLE a (code INT);
INSERT INTO a VALUES (1),(2),(NULL);
CREATE TABLE b (code INT, name VARCHAR);
INSERT INTO b VALUES (1, 'A'),(2, 'B'),(NULL, 'C'), (NULL, 'D');
SELECT * FROM a LEFT JOIN b USING (code);
It seems that dplyr joins do not treat NAs like SQL NULL values.
Is there a way to get dplyr to behave in the same way as SQL?
What is the rationale behind this type of NA treatment?
PS. Of course, I could remove NAs first to get there left_join(a, na.omit(b), by="code"), but that is not my question.
In SQL, "null" matches nothing, because SQL has no information on what it should join to -- hence the resulting "null"s in your joined data set, just as it would appear if performing left outer joins without a match in the right data set.
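The SQL side of this can be reproduced with the tables from the question. The sketch below assumes Python's built-in sqlite3 module; the same comparison rule (NULL = NULL is not true) applies in other SQL databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (code INT);
    INSERT INTO a VALUES (1), (2), (NULL);
    CREATE TABLE b (code INT, name VARCHAR);
    INSERT INTO b VALUES (1, 'A'), (2, 'B'), (NULL, 'C'), (NULL, 'D');
""")

# The NULL row of a finds no partner in b, so its name comes back as NULL.
joined = conn.execute(
    "SELECT a.code, b.name FROM a LEFT JOIN b ON a.code = b.code"
).fetchall()

# Comparing NULL with NULL yields NULL (unknown), not true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]

print(joined)
print(null_equals_null)
```

The join returns three rows: the two matches plus one row for the NULL code with no name, whereas dplyr returned four.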
In R however, the default behaviour for "NA" when it comes to joins is almost to treat it like a data point (e.g. a null operator), so "NA" would match "NA". For example,
> match(NA, NA)
[1] 1
One way you can circumvent this would be to use the base merge method,
> merge(a, b, by="code", all.x=TRUE, incomparables=NA)
code name
1 1 A
2 2 B
3 NA <NA>
The "incomparables" parameter here allows you to define values that cannot be matched, and essentially forces R to treat "NA" the way SQL treats "null". left_join has no parameter called incomparables, but recent dplyr versions offer the same behaviour through the na_matches argument: left_join(a, b, by="code", na_matches="never").
By default, the column code has a primary key and therefore does not accept NULL values.