Joining/Querying Tables by a Substring - sql

Currently I have a query which is partly based on a join on two tables according to two number columns within them.
Say one table has a number like 123456789999 (NUM1)
And the other table has a number ranging from 1 - 9999 (NUM2)
I want to pull out the records which have 'NUM2' within the 5th - 8th digits of 'NUM1'
Currently I am doing something like this,
FROM Table1 AS T INNER JOIN Table2 AS S
ON SUBSTRING(T.num1, 5, 4) = S.num2
I know it should be retrieving approx 100 records, but I only get 8. I believe it to be because of the small ranges within number two. Where have I gone wrong? OR how could my code be made more robust/effective?

You need to use CAST like this:
FROM Table1 AS T INNER JOIN Table2 AS S
ON CAST(SUBSTRING(T.num1, 5, 4) AS INT) = S.num2
For more info see SQL SERVER – Convert Text to Numbers (Integer) – CAST and CONVERT

Since the datatype of NUM2 is int, 0001 will be considered as just 1, so cast the substring to an int before comparing:
FROM Table1 AS T INNER JOIN Table2 AS S
ON cast(SUBSTRING(T.num1, 5, 4) as int) = S.num2
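To see why the cast matters, here is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server. It assumes NUM2 is effectively compared as text without leading zeros, which is one way the symptom (only the 4-digit values matching) can arise; the sample values and the function name matching_rows are made up for illustration, and substr is SQLite's spelling of SUBSTRING.

```python
import sqlite3


def matching_rows(cast_to_int: bool) -> list:
    """Join Table1 to Table2 on digits 5-8 of num1, with or without a cast."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Table1 (num1 TEXT);
        -- Assumption: num2 is held as text with no leading zeros.
        CREATE TABLE Table2 (num2 TEXT);
        INSERT INTO Table1 VALUES ('123400129999'), ('123456789999');
        INSERT INTO Table2 VALUES ('12'), ('5678'), ('77');
    """)
    if cast_to_int:
        # '0012' becomes 12, so it can match the stored value 12.
        on = "CAST(substr(T.num1, 5, 4) AS INTEGER) = CAST(S.num2 AS INTEGER)"
    else:
        # Text comparison: '0012' is not equal to '12'.
        on = "substr(T.num1, 5, 4) = S.num2"
    rows = conn.execute(
        "SELECT T.num1, S.num2 FROM Table1 AS T "
        f"INNER JOIN Table2 AS S ON {on} ORDER BY T.num1"
    ).fetchall()
    conn.close()
    return rows


print(matching_rows(cast_to_int=False))  # [('123456789999', '5678')]
print(matching_rows(cast_to_int=True))   # both rows match
```

Without the cast only the 4-digit value matches, which is consistent with getting far fewer rows than expected.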

Related

How to write a SQL query to calculate percentages based on values across different tables?

Suppose I have a database containing two tables, similar to below:
Table 1:
tweet_id tweet
1 Scrap the election results
2 The election was great!
3 Great stuff
Table 2:
politician tweet_id
TRUE 1
FALSE 2
FALSE 3
I'm trying to write a SQL query which returns the percentage of tweets that contain the word 'election' broken down by whether they were a politician or not.
So for instance here, the first 2 tweets in Table 1 contain the word election. By looking at Table 2, you can see that tweet_id 1 was written by a politician, whereas tweet_id 2 was written by a non-politician.
Hence, the result of the SQL query should return 50% for politicians and 50% for non-politicians (i.e. two tweets contained the word 'election', one by a politician and one by a non-politician).
Any ideas how to write this in SQL?
You could do this by creating one subquery to return all election tweets, and one subquery to return all election tweets by politicians, then join.
Here is a sample. Note that you may need to cast the totals to decimals before dividing (depending on which SQL provider you are working in).
select
    politician_tweets.total / election_tweets.total
from
    (
        select
            count(tweet) as total
        from
            table_1
            join table_2 on table_1.tweet_id = table_2.tweet_id
        where
            tweet like '%election%'
    ) election_tweets
    join
    (
        select
            count(tweet) as total
        from
            table_1
            join table_2 on table_1.tweet_id = table_2.tweet_id
        where
            tweet like '%election%' and
            politician = 1
    ) politician_tweets
    on 1 = 1
You can use aggregation like this:
select t2.politician, avg( case when t.tweet like '%election%' then 1.0 else 0 end) as election_ratio
from tweets t join
table2 t2
on t.tweet_id = t2.tweet_id
group by t2.politician;
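Note that the conditional average above measures, for each group, the fraction of that group's tweets that mention the election, which is not quite the same as each group's share of all election tweets (the 50%/50% the question describes). The sketch below runs both on the sample data, using Python's sqlite3 module as a stand-in database. It uses the table names table_1 and table_2 from the first answer, stores politician as 1/0, and the two function names are hypothetical.

```python
import sqlite3


def _sample_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE table_1 (tweet_id INTEGER, tweet TEXT);
        CREATE TABLE table_2 (politician INTEGER, tweet_id INTEGER);
        INSERT INTO table_1 VALUES
            (1, 'Scrap the election results'),
            (2, 'The election was great!'),
            (3, 'Great stuff');
        INSERT INTO table_2 VALUES (1, 1), (0, 2), (0, 3);
    """)
    return conn


def election_rate_within_group() -> list:
    """Fraction of each group's own tweets that mention 'election'."""
    conn = _sample_db()
    rows = conn.execute("""
        SELECT t2.politician,
               AVG(CASE WHEN t1.tweet LIKE '%election%' THEN 1.0 ELSE 0 END)
        FROM table_1 t1
        JOIN table_2 t2 ON t1.tweet_id = t2.tweet_id
        GROUP BY t2.politician
        ORDER BY t2.politician
    """).fetchall()
    conn.close()
    return rows


def share_of_election_tweets() -> list:
    """Each group's share of all tweets that mention 'election'."""
    conn = _sample_db()
    rows = conn.execute("""
        SELECT t2.politician,
               COUNT(*) * 1.0 /
                   (SELECT COUNT(*) FROM table_1 WHERE tweet LIKE '%election%')
        FROM table_1 t1
        JOIN table_2 t2 ON t1.tweet_id = t2.tweet_id
        WHERE t1.tweet LIKE '%election%'
        GROUP BY t2.politician
        ORDER BY t2.politician
    """).fetchall()
    conn.close()
    return rows


print(election_rate_within_group())  # [(0, 0.5), (1, 1.0)]
print(share_of_election_tweets())    # [(0, 0.5), (1, 0.5)]
```

Multiplying by 1.0 avoids integer division, which is the cast the first answer warns about.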

Concatenating codes to obtain sum

I've spent the past 2 days trying to solve this problem, but I can't even find the right terms to google it.
I have 3 tables.
This one, with client codes that changed:
ActualCode=11111111 PreviousCode=44444444
And these two tables with value 1 and value 2:
PreviousCode=11111111, Value1= 50,00, Value2= 0,00
ActualCode=44444444 , Value1= 0,00, Value2 = 50,00
I need to sum the values for each relation of Previous and Actual codes from the first table.
I.E.
For
ActualCode=11111111, PreviousCode=44444444
I need to be able to get:
Code=11111111 Value1=50,00 Value2=50,00
Looking forward to your answer :D
Thanks,
P
You can join the tables and sum the values:
select c.actualcode,
sum(ac.value1) + sum(pc.value1) as value1,
sum(ac.value2) + sum(pc.value2) as value2
from codes c
join actualcodes ac on c.actualcode = ac.actualcode
join previouscodes pc on c.previouscode = pc.previouscode
group by c.actualcode;
If you could have values in the main table that don't have corresponding rows in the values tables, then you should use outer joins instead.
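As a rough illustration of the inner versus outer join difference, the sketch below runs the same query both ways with Python's sqlite3 module. The table layout follows the query above (codes, actualcodes, previouscodes); the second code pair (22222222/55555555) and its values are invented so that one code has no row in previouscodes, and COALESCE is added so that a missing side counts as 0 rather than NULL.

```python
import sqlite3


def summed_values(join_kind: str) -> list:
    """Sum value1/value2 per actual code; join_kind is 'JOIN' or 'LEFT JOIN'."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE codes (actualcode INTEGER, previouscode INTEGER);
        CREATE TABLE actualcodes (actualcode INTEGER, value1 REAL, value2 REAL);
        CREATE TABLE previouscodes (previouscode INTEGER, value1 REAL, value2 REAL);
        INSERT INTO codes VALUES (11111111, 44444444), (22222222, 55555555);
        INSERT INTO actualcodes VALUES (11111111, 50.0, 0.0), (22222222, 10.0, 5.0);
        -- No row for previous code 55555555 (made-up gap for the demo).
        INSERT INTO previouscodes VALUES (44444444, 0.0, 50.0);
    """)
    rows = conn.execute(f"""
        SELECT c.actualcode,
               COALESCE(SUM(ac.value1), 0) + COALESCE(SUM(pc.value1), 0) AS value1,
               COALESCE(SUM(ac.value2), 0) + COALESCE(SUM(pc.value2), 0) AS value2
        FROM codes c
        {join_kind} actualcodes ac ON c.actualcode = ac.actualcode
        {join_kind} previouscodes pc ON c.previouscode = pc.previouscode
        GROUP BY c.actualcode
        ORDER BY c.actualcode
    """).fetchall()
    conn.close()
    return rows


print(summed_values("JOIN"))       # [(11111111, 50.0, 50.0)]
print(summed_values("LEFT JOIN"))  # [(11111111, 50.0, 50.0), (22222222, 10.0, 5.0)]
```

With inner joins the code without a previous-code row disappears from the result; with outer joins it is kept. If either values table can hold several rows per code, the two joins multiply rows, so aggregating each table before joining would be the safer shape.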

SQL Case with calculation on 2 columns

I have a value table and I need to write a case statement that touches 2 columns. Below is the example:
Type  State  Min  Max  Value
A     TX     2    15   100
A     TX     16   30   200
A     TX     31+       500
Let's say I have another table that has the following:
Type State Weight Value
A TX 14 ?
So when I join the tables, I need a case statement that takes the weight, type and state from table 2, compares them to table 1, works out that the weight falls between 2 and 15 (row 1), and updates Value in table 2 to 100.
Is this doable ?
Thanks
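This returns 0 if no row in the value table covers the weight. Here table_a is the table with the weights and table_b is the value table; the coalesce has to wrap the whole subquery, because a subquery that finds no row yields NULL.
select Type, State, Weight,
  coalesce((select table_b.Value
            from table_b
            where table_b.Type = table_a.Type
            and table_b.State = table_a.State
            and table_a.Weight between table_b.Min and table_b.Max), 0) as Value
from table_a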
For an Alteryx solution: (1) run both tables into a Join tool, joining on Type and State; (2) Send the output to a Filter tool where you force Weight to be between Min and Max; (3) Send that output to a Select tool, where you grab only the specific columns you want; (since the Join will give you all columns from all tables). Done.
Caveats: the data running from Join to Filter could be large, since you are joining every Type/State combination in the Lookup table to the other table. Depending on the size of your datasets, that might be cumbersome. Alteryx is very fast though, and at least we're limiting on State and Type, so if your datasets aren't too large, this simple solution will work fine.
With larger data, try to do it as part of your original select, utilizing one of the other solutions given here for your SQL query.
Assuming the Min and Max columns in the first table are of integer type, you can use an INNER JOIN on the ranges:
SELECT *
FROM another_table a
JOIN first_table b
ON a.type = b.type
AND a.State = b.State
AND a.Weight BETWEEN b.min AND b.max
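One thing the range join needs care with is the open-ended "31+" row. The sketch below, run through Python's sqlite3 module, assumes that row is stored with a NULL upper bound and treats NULL as "no limit"; the table names (value_table, weights) and column names (min_weight, max_weight) are hypothetical. It uses a LEFT JOIN plus COALESCE so that weights outside every range come back as 0.

```python
import sqlite3


def lookup_values() -> list:
    """Match each weight to the range row that contains it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE value_table (
            type TEXT, state TEXT,
            min_weight INTEGER,
            max_weight INTEGER,   -- assumption: NULL means open-ended ("31+")
            value INTEGER
        );
        CREATE TABLE weights (type TEXT, state TEXT, weight INTEGER);
        INSERT INTO value_table VALUES
            ('A', 'TX', 2, 15, 100),
            ('A', 'TX', 16, 30, 200),
            ('A', 'TX', 31, NULL, 500);
        INSERT INTO weights VALUES ('A', 'TX', 14), ('A', 'TX', 45), ('A', 'TX', 1);
    """)
    rows = conn.execute("""
        SELECT a.type, a.state, a.weight, COALESCE(b.value, 0) AS value
        FROM weights a
        LEFT JOIN value_table b
          ON a.type = b.type
         AND a.state = b.state
         AND a.weight >= b.min_weight
         AND (b.max_weight IS NULL OR a.weight <= b.max_weight)
        ORDER BY a.weight
    """).fetchall()
    conn.close()
    return rows


print(lookup_values())
# [('A', 'TX', 1, 0), ('A', 'TX', 14, 100), ('A', 'TX', 45, 500)]
```

A plain BETWEEN would skip the open-ended row, since comparing against NULL never evaluates to true.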

SQL Query - Limited by another SQL query of a different data type

I need some help on this one. I have a query that I need to make work but I need to limit it by the results of another query.
SELECT ItemID, ItemNums
FROM dbo.Tables
ItemNums is a varchar field that is used to store the strings of the various item numbers.
This produces the following.
ItemID ItemNums
1 1, 4, 5
2 1, 3, 4, 5
3 2
4 4
5 1
I have another table that has each item number as an INT that I need to use to pull all ItemIDs that have the associated ItemNums
Something like this.
SELECT *
FROM dbo.Tables
WHERE ItemNums IN (4,5)
Any help would be appreciated.
If possible, you should change your database schema. In general, it's not good to store comma delimited lists in a relational database.
However, if that's not an option, here's one way using a join with like:
select *
from dbo.Tables t
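join dbo.SecondTable st on ', '+t.ItemNums+',' like '%, '+cast(st.ItemNumId as varchar(10))+',%'
This concatenates commas to the beginning and end of the itemnums to ensure you only match on the specific ids. The cast is needed because the item number is an INT, and adding an INT to a string with + would otherwise be treated as arithmetic.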
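The delimiter trick can be tried out with Python's sqlite3 module, as in the sketch below. SQLite concatenates with || rather than +, and the table names items and wanted are hypothetical stand-ins for dbo.Tables and the second table. An extra row ('14, 45') is added to the sample data to check that partial numbers do not match.

```python
import sqlite3


def item_ids_containing_wanted() -> list:
    """Return ItemIDs whose comma-separated ItemNums include a wanted number."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE items (ItemID INTEGER, ItemNums TEXT);
        CREATE TABLE wanted (ItemNumId INTEGER);
        INSERT INTO items VALUES
            (1, '1, 4, 5'), (2, '1, 3, 4, 5'), (3, '2'),
            (4, '4'), (5, '1'),
            (6, '14, 45');   -- extra row: must NOT match 4 or 5
        INSERT INTO wanted VALUES (4), (5);
    """)
    rows = conn.execute("""
        SELECT DISTINCT t.ItemID
        FROM items t
        JOIN wanted st
          ON ', ' || t.ItemNums || ',' LIKE '%, ' || st.ItemNumId || ',%'
        ORDER BY t.ItemID
    """).fetchall()
    conn.close()
    return [item_id for (item_id,) in rows]


print(item_ids_containing_wanted())  # [1, 2, 4]
```

DISTINCT is there because an item that contains both 4 and 5 would otherwise be returned once per matching number.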
I personally would recommend normalizing your dbo.tables.
It would be better as:
ItemID ItemNums
1 1
1 4
1 5
2 1
etc.
Then you can use a join or a sub query to pull out the rows with ItemNums in some list.
Otherwise, it's going to be a mess and not very fast.
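A minimal sketch of that normalization, assuming the lists are split once outside the database (here in Python) and loaded into a hypothetical item_nums table with one row per ItemID/ItemNum pair:

```python
import sqlite3

# The comma-separated data from the question, keyed by ItemID.
raw = {1: "1, 4, 5", 2: "1, 3, 4, 5", 3: "2", 4: "4", 5: "1"}

# One (ItemID, ItemNum) pair per list entry; int() tolerates the spaces.
pairs = [
    (item_id, int(num))
    for item_id, nums in raw.items()
    for num in nums.split(",")
]


def item_ids_with(nums: tuple) -> list:
    """Return ItemIDs that have at least one of the given item numbers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item_nums (ItemID INTEGER, ItemNum INTEGER)")
    conn.executemany("INSERT INTO item_nums VALUES (?, ?)", pairs)
    placeholders = ", ".join("?" for _ in nums)
    rows = conn.execute(
        "SELECT DISTINCT ItemID FROM item_nums "
        f"WHERE ItemNum IN ({placeholders}) ORDER BY ItemID",
        nums,
    ).fetchall()
    conn.close()
    return [item_id for (item_id,) in rows]


print(item_ids_with((4, 5)))  # [1, 2, 4]
```

Once the data is in this shape, the IN (4,5) filter from the question works as written, and the column can be indexed.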

Comparing two columns with containing one column and an addition

I have an SQL table with a lot of rows. One column in this table is called Label.
The label is a combination of different numbers; example of this is
11-1234-1-1
or
11-1234-12-20
The first part is always 2 digits (11), and after the first delimiter there are always 4 digits (1234). The third part of the label can be either 1 or 2 digits (i.e. it can be 1 or 12 or some other random number). The fourth part is random, ranging from 1-99.
In this table, I also have the exact same values, but with the fourth part prefixed by 10 or 100 (so the fourth part has 4 digits).
Example of this is: 11-1234-12-1020
11-1234-12-20 and 11-1234-12-1020 are the same.
I want to find all these values where part B contains Part A.
The labels are found in the same column.
I have joined the columns with each other:
SELECT A.LABEL, B.LABEL
FROM TABLE A
JOIN TABLE B ON A.LABEL = B.LABEL
WHERE ??
What should my WHERE-clause be?
I have tried with LIKE and SUBSTRING but I'm missing getting values.
I.e.
WHERE A.LABEL LIKE SUBSTRING(B.LABEL,1,12) + '10' + '%'
Seeing I'm a beginner at this I'm kind of stuck. Help please :)
This should work
SELECT A.LABEL, B.LABEL FROM TABLE A
JOIN TABLE B ON
CASE WHEN LEN(RIGHT(A.LABEL, CHARINDEX('-', reverse(A.LABEL))-1)) = 1
THEN
STUFF(A.LABEL, LEN(A.LABEL) - CHARINDEX('-', reverse(A.LABEL))+1, 1, '-100')
ELSE
STUFF(A.LABEL, LEN(A.LABEL) - CHARINDEX('-', reverse(A.LABEL))+1, 1, '-10')
END = B.LABEL
So basically we find the last position of a - character in the string by reversing the string:
CHARINDEX('-', reverse(A.LABEL))
Then we insert either a 10 or a 100 at that point to compare with the other labels.
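The same transformation can be checked outside SQL Server. The sketch below is a Python equivalent of the CASE/STUFF expression: it splits at the last '-', prefixes the final part with 100 or 10 depending on its length, and registers the helper as a SQL function in an in-memory SQLite database so it can be used in a self-join. The function name expand_label, the table name labels, and the extra sample label '11-1234-3-7' are made up for the example.

```python
import sqlite3


def expand_label(label: str) -> str:
    """'11-1234-12-20' -> '11-1234-12-1020', '11-1234-1-1' -> '11-1234-1-1001'."""
    head, last = label.rsplit("-", 1)           # split at the last '-'
    prefix = "100" if len(last) == 1 else "10"  # pad the last part to 4 digits
    return f"{head}-{prefix}{last}"


def matching_label_pairs() -> list:
    conn = sqlite3.connect(":memory:")
    # Expose the Python helper to SQL under the same name.
    conn.create_function("expand_label", 1, expand_label)
    conn.execute("CREATE TABLE labels (label TEXT)")
    conn.executemany(
        "INSERT INTO labels VALUES (?)",
        [("11-1234-1-1",), ("11-1234-1-1001",),
         ("11-1234-12-20",), ("11-1234-12-1020",),
         ("11-1234-3-7",)],  # no expanded counterpart in the table
    )
    rows = conn.execute("""
        SELECT a.label, b.label
        FROM labels a
        JOIN labels b ON expand_label(a.label) = b.label
        ORDER BY a.label
    """).fetchall()
    conn.close()
    return rows


print(matching_label_pairs())
# [('11-1234-1-1', '11-1234-1-1001'), ('11-1234-12-20', '11-1234-12-1020')]
```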
You need to do it on the join - remember you are joining two independent sets (tables) and you want the intersection where your pattern matches.
SELECT A.LABEL, B.LABEL
FROM TABLE A
INNER JOIN TABLE B ON B.LABEL LIKE A.LABEL + '%'
Cheers, T