How to index match with conditions in sql - sql

I have tables like this:
regist table
userID  registDate
1       2022-01-22
2       2022-01-23
session table
userID  date_key    traffic
null    2022-01-02  facebook
1       2021-01-03  facebook
1       2021-01-04  google
1       2021-01-05  linkedin
2       2021-01-15  facebook
2       2021-01-25  facebook
3       2021-01-20  facebook
Output
userID  date_key    traffic   regist date
1       2021-01-03  facebook  2022-01-22
1       2021-01-04  google    2022-01-22
1       2021-01-05  linkedin  2022-01-22
2       2021-01-15  facebook  2022-01-23
How do I merge the tables so that I can return the regist date? Do I do a right join?
Is this correct?
select *
from sessiontables st
left join registtable rt on st.userID = rt.userID
where st.userID is not null
How do I add a condition that the userID must exist in the regist table?

If I understand correctly, you can join the regist table to a subquery that aggregates the session table (the earliest date_key per userID and traffic).
select rt.userID,
st.date_key,
st.traffic,
rt.registDate
from (
SELECT userID,min(date_key) date_key,traffic
FROM sessiontables
GROUP BY traffic,userID
) st
JOIN registtable rt
ON st.userID=rt.userID
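The query above can be checked against the sample rows from the question. The following is a minimal sketch that assumes SQLite (through Python's sqlite3 module) as a stand-in for whatever engine the asker uses, and assumes TEXT columns for the dates. The inner join drops both the null userID and userID 3, which has no row in the regist table, and MIN(date_key) keeps only the earliest facebook session for userID 2.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE registtable (userID INTEGER, registDate TEXT);
CREATE TABLE sessiontables (userID INTEGER, date_key TEXT, traffic TEXT);
INSERT INTO registtable VALUES (1, '2022-01-22'), (2, '2022-01-23');
INSERT INTO sessiontables VALUES
  (NULL, '2022-01-02', 'facebook'),
  (1, '2021-01-03', 'facebook'),
  (1, '2021-01-04', 'google'),
  (1, '2021-01-05', 'linkedin'),
  (2, '2021-01-15', 'facebook'),
  (2, '2021-01-25', 'facebook'),
  (3, '2021-01-20', 'facebook');
""")

# Earliest session per (userID, traffic), kept only for registered users.
rows = conn.execute("""
SELECT rt.userID, st.date_key, st.traffic, rt.registDate
FROM (
    SELECT userID, MIN(date_key) AS date_key, traffic
    FROM sessiontables
    GROUP BY traffic, userID
) st
JOIN registtable rt ON st.userID = rt.userID
ORDER BY rt.userID, st.date_key
""").fetchall()

for row in rows:
    print(row)
```

If every session of a registered user should be returned instead, a plain inner join between the two tables without the aggregated subquery would do that.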

Related

Compare and get missing values from two columns of two different tables

I have two tables named StationUtilization and Process. StationUtilization has a TestStart column and Process has a TestDateTime column, and the two columns should contain the same records.
However, some records are missing from the TestStart column of the StationUtilization table and need to be added. How can I compare these two columns to get the missing values?
Example:
StationUtilization Table
ID  TestStart                .....
1   2021-01-01 22:42:23.000
2   2021-01-02 22:42:23.000
3   2021-01-05 22:42:23.000
Process Table:
ID  TestDateTime             .....
1   2021-01-01 22:42:23.000
2   2021-01-02 22:42:23.000
3   2021-01-03 22:42:23.000
4   2021-01-04 22:42:23.000
5   2021-01-05 22:42:23.000
Expected output after comparison:
ID  TestDateTime             .....
3   2021-01-03 22:42:23.000
4   2021-01-04 22:42:23.000
SELECT * FROM StationUtilization
LEFT JOIN Process
ON Process.TestDateTime = StationUtilization.TestStart
WHERE PROCESS.ID is null
NOT EXISTS is one approach:
select p.*
from Process p
where not exists (select 1
from StationUtilization su
where p.TestDateTime = su.TestStart
);
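As a quick check of the NOT EXISTS approach, here is a small sketch that runs it on the sample rows, assuming SQLite via Python's sqlite3 module and TEXT timestamps (both are assumptions, not something the question states). It also shows the equivalent LEFT JOIN anti-join with Process on the left side, since the rows being looked for are the ones in Process that have no partner in StationUtilization.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE StationUtilization (ID INTEGER, TestStart TEXT);
CREATE TABLE Process (ID INTEGER, TestDateTime TEXT);
INSERT INTO StationUtilization VALUES
  (1, '2021-01-01 22:42:23.000'),
  (2, '2021-01-02 22:42:23.000'),
  (3, '2021-01-05 22:42:23.000');
INSERT INTO Process VALUES
  (1, '2021-01-01 22:42:23.000'),
  (2, '2021-01-02 22:42:23.000'),
  (3, '2021-01-03 22:42:23.000'),
  (4, '2021-01-04 22:42:23.000'),
  (5, '2021-01-05 22:42:23.000');
""")

# Rows of Process whose timestamp never appears in StationUtilization.
not_exists_rows = conn.execute("""
SELECT p.ID, p.TestDateTime
FROM Process p
WHERE NOT EXISTS (SELECT 1
                  FROM StationUtilization su
                  WHERE su.TestStart = p.TestDateTime)
ORDER BY p.ID
""").fetchall()

# Same result as an anti-join: keep Process rows with no joined partner.
left_join_rows = conn.execute("""
SELECT p.ID, p.TestDateTime
FROM Process p
LEFT JOIN StationUtilization su ON su.TestStart = p.TestDateTime
WHERE su.ID IS NULL
ORDER BY p.ID
""").fetchall()

print(not_exists_rows)
print(left_join_rows)
```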

calculate partition from two table and using index less than or equals of index in bigquery

I have 2 tables. The first is the main table, which I want to join to the second table and sum over a partition of it.
The first table is main_table:
Month       Product  MOB
2020-12-01  B2B      1
2020-12-01  B2B      2
2021-01-01  B2B      1
2020-11-01  B2C      1
2020-11-01  B2C      2
2020-11-01  B2C      3
The second table is second_table:
month       Product  MOB  amount
2020-12-01  B2B      0    100
2020-12-01  B2B      2    100
2021-01-01  B2B      1    50
2020-11-01  B2C      -2   50
2020-11-01  B2C      1    55
2020-11-01  B2C      3    100
My expected result is:
Month       Product  MOB  partition_amount
2020-12-01  B2B      1    100
2020-12-01  B2B      2    200
2021-01-01  B2B      1    50
2020-11-01  B2C      1    105
2020-11-01  B2C      2    105
2020-11-01  B2C      3    205
partition_amount is calculated by matching main_table.Month = second_table.month and main_table.Product = second_table.Product, then summing second_table.amount over the rows where second_table.MOB <= main_table.MOB.
Can anyone help me write this query in BigQuery?
it would be calculate when second_table.mob <= main_table.mob
One method is join and aggregation:
select m.month, m.product, m.mob, sum(s.amount) as partition_amount
from main_table m join
     second_table s
     on s.month = m.month and
        s.product = m.product and
        s.mob <= m.mob
group by 1, 2, 3;
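The join-and-aggregate idea can be tried on the sample data. This is a sketch that assumes SQLite via Python's sqlite3 module instead of BigQuery, with TEXT month columns. It sums second_table.amount, the only amount column the question defines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE main_table (month TEXT, product TEXT, mob INTEGER);
CREATE TABLE second_table (month TEXT, product TEXT, mob INTEGER, amount INTEGER);
INSERT INTO main_table VALUES
  ('2020-12-01', 'B2B', 1), ('2020-12-01', 'B2B', 2), ('2021-01-01', 'B2B', 1),
  ('2020-11-01', 'B2C', 1), ('2020-11-01', 'B2C', 2), ('2020-11-01', 'B2C', 3);
INSERT INTO second_table VALUES
  ('2020-12-01', 'B2B', 0, 100), ('2020-12-01', 'B2B', 2, 100),
  ('2021-01-01', 'B2B', 1, 50),
  ('2020-11-01', 'B2C', -2, 50), ('2020-11-01', 'B2C', 1, 55),
  ('2020-11-01', 'B2C', 3, 100);
""")

# Each main_table row is paired with every second_table row of the same
# month and product whose mob is <= its own mob; the amounts are then summed.
rows = conn.execute("""
SELECT m.month, m.product, m.mob, SUM(s.amount) AS partition_amount
FROM main_table m
JOIN second_table s
  ON s.month = m.month
 AND s.product = m.product
 AND s.mob <= m.mob
GROUP BY m.month, m.product, m.mob
ORDER BY m.month, m.product, m.mob
""").fetchall()

for row in rows:
    print(row)
```

Because this is an inner join, a main_table row with no qualifying second_table row would disappear from the result. A left join with a conditional sum keeps such rows with a total of 0.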
Consider below
select any_value(main_table).*,
sum(if(second_table.mob <= main_table.mob, amount, 0)) as partition_amount
from `project.dataset.main_table` main_table
left join `project.dataset.second_table` second_table
using(month, product)
group by format('%t', main_table)
If applied to the sample data in your question, the output is the expected result shown there.

Creating a new calculated column in SQL

Is there a way to produce the summary below? There are 2 UDs on June 24, because that date appears 2 times, and the remaining dates each appear once.
I am showing the expected output here:
Primary key UD Date
-------------------------------------------
1 123 2015-06-24 00:00:00.000
6 456 2015-06-24 00:00:00.000
2 123 2015-06-25 00:00:00.000
3 658 2015-06-26 00:00:00.000
4 598 2015-06-27 00:00:00.000
5 156 2015-06-28 00:00:00.000
No of times Number of days
-----------------------------
4 1
2 2
The logic is that 4 users used the application on 1 day and 2 users used the application on 2 days.
You can use two levels of aggregation:
select cnt, count(*)
from (select date, count(*) as cnt
from t
group by date
) d
group by cnt
order by cnt desc;
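To see what the two levels of aggregation return, the query can be run on the six sample rows. The sketch below assumes SQLite via Python's sqlite3 module and a hypothetical table t with columns pk, ud and date; the question does not give the real table or column names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (pk INTEGER, ud INTEGER, date TEXT);
INSERT INTO t VALUES
  (1, 123, '2015-06-24 00:00:00.000'),
  (6, 456, '2015-06-24 00:00:00.000'),
  (2, 123, '2015-06-25 00:00:00.000'),
  (3, 658, '2015-06-26 00:00:00.000'),
  (4, 598, '2015-06-27 00:00:00.000'),
  (5, 156, '2015-06-28 00:00:00.000');
""")

# Inner level: rows per date. Outer level: how many dates share each row count.
rows = conn.execute("""
SELECT cnt, COUNT(*) AS num_dates
FROM (SELECT date, COUNT(*) AS cnt
      FROM t
      GROUP BY date) d
GROUP BY cnt
ORDER BY cnt DESC
""").fetchall()

print(rows)
```

On the sample rows this returns (2, 1) and (1, 4): one date has two rows and four dates have one row each.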

SQL update from one Table A to table B based on match 2 ID based on aggregation of points

I'm trying to update table (final) with values from table (first_stage). Both tables have the same field names but different values. The columns are:
(Min_Date) date,
(Max_Date) date,
(NoofDays) int,
(IMSI) string,
(Site) string,
(Down_Link) int,
(Up_Link) int,
(Connection) int
For each (IMSI, Site) pair that already exists in the final table, take the minimum date as Min_Date and the maximum date as Max_Date, and sum the remaining columns, that is:
min(Min_Date), max(Max_Date), sum(NoofDays), sum(Down_Link), sum(Up_Link), sum(Connection)
If the pair (IMSI, Site) has no match in the final table, insert the row into the final table. I'm still a newbie with SQL.
table first_stage:
MinDate Max_Date NoofDays IMSI Site Down_link Up_link Connection
2019-03-22 2019-03-26 1 222 google 1 1 1
2019-03-26 2019-03-27 3 222 youtube 1 1 1
2019-03-02 2019-03-27 5 333 facebook 2 3 1
2019-03-02 2019-03-27 5 111 facebook 20 33 11
table final:
MinDate Max_Date NoofDays IMSI Site Down_link Up_link Connection
2019-03-01 2019-03-27 1 222 google 2 2 1
2019-03-12 2019-03-25 1 222 youtube 2 2 2
2019-03-25 2019-03-27 4 333 facebook 3 6 1
A row must match on both IMSI and Site for the update to apply. After the update, the final table should look like this:
table final:
MinDate Max_Date NoofDays IMSI Site Down_link Up_link Connection
2019-03-01 2019-03-27 2 222 google 3 3 2
2019-03-12 2019-03-27 4 222 youtube 3 3 3
2019-03-02 2019-03-27 9 333 facebook 5 9 2
2019-03-02 2019-03-27 5 111 facebook 20 33 11
I have never worked with vertica, but I think this might work:
MERGE
INTO FINAL
USING FIRST_STAGE
ON FINAL.IMSI = FIRST_STAGE.IMSI AND FINAL.Site = FIRST_STAGE.Site
WHEN MATCHED THEN UPDATE SET
    Min_Date   = LEAST(FIRST_STAGE.Min_Date, FINAL.Min_Date),
    Max_Date   = GREATEST(FIRST_STAGE.Max_Date, FINAL.Max_Date),
    NoofDays   = FIRST_STAGE.NoofDays + FINAL.NoofDays,
    Down_Link  = FIRST_STAGE.Down_Link + FINAL.Down_Link,
    Up_Link    = FIRST_STAGE.Up_Link + FINAL.Up_Link,
    Connection = FIRST_STAGE.Connection + FINAL.Connection
WHEN NOT MATCHED THEN INSERT ( Min_Date,
                               Max_Date,
                               NoofDays,
                               IMSI,
                               Site,
                               Down_Link,
                               Up_Link,
                               Connection )
VALUES ( FIRST_STAGE.Min_Date,
         FIRST_STAGE.Max_Date,
         FIRST_STAGE.NoofDays,
         FIRST_STAGE.IMSI,
         FIRST_STAGE.Site,
         FIRST_STAGE.Down_Link,
         FIRST_STAGE.Up_Link,
         FIRST_STAGE.Connection );
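The merge logic itself can be sanity-checked outside Vertica. The sketch below is not the MERGE statement: it swaps in SQLite's INSERT ... ON CONFLICT DO UPDATE (an upsert, available in SQLite 3.24 and later) through Python's sqlite3 module, and it assumes a UNIQUE constraint on (IMSI, Site) that the question does not mention. SQLite's two-argument min() and max() play the role of LEAST and GREATEST here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE first_stage (Min_Date TEXT, Max_Date TEXT, NoofDays INT,
                          IMSI TEXT, Site TEXT,
                          Down_Link INT, Up_Link INT, Connection INT);
-- Assumption: (IMSI, Site) is unique in final, which the upsert relies on.
CREATE TABLE final (Min_Date TEXT, Max_Date TEXT, NoofDays INT,
                    IMSI TEXT, Site TEXT,
                    Down_Link INT, Up_Link INT, Connection INT,
                    UNIQUE (IMSI, Site));
INSERT INTO first_stage VALUES
  ('2019-03-22', '2019-03-26', 1, '222', 'google',   1,  1,  1),
  ('2019-03-26', '2019-03-27', 3, '222', 'youtube',  1,  1,  1),
  ('2019-03-02', '2019-03-27', 5, '333', 'facebook', 2,  3,  1),
  ('2019-03-02', '2019-03-27', 5, '111', 'facebook', 20, 33, 11);
INSERT INTO final VALUES
  ('2019-03-01', '2019-03-27', 1, '222', 'google',   2, 2, 1),
  ('2019-03-12', '2019-03-25', 1, '222', 'youtube',  2, 2, 2),
  ('2019-03-25', '2019-03-27', 4, '333', 'facebook', 3, 6, 1);
""")

# "WHERE 1" avoids a parsing ambiguity between the SELECT and ON CONFLICT.
# Unqualified names in SET refer to the existing row; excluded.* is the
# incoming first_stage row.
conn.execute("""
INSERT INTO final (Min_Date, Max_Date, NoofDays, IMSI, Site,
                   Down_Link, Up_Link, Connection)
SELECT Min_Date, Max_Date, NoofDays, IMSI, Site,
       Down_Link, Up_Link, Connection
FROM first_stage
WHERE 1
ON CONFLICT (IMSI, Site) DO UPDATE SET
    Min_Date   = min(Min_Date, excluded.Min_Date),
    Max_Date   = max(Max_Date, excluded.Max_Date),
    NoofDays   = NoofDays + excluded.NoofDays,
    Down_Link  = Down_Link + excluded.Down_Link,
    Up_Link    = Up_Link + excluded.Up_Link,
    Connection = Connection + excluded.Connection
""")

rows = conn.execute("SELECT * FROM final ORDER BY IMSI, Site").fetchall()
for row in rows:
    print(row)
```

The resulting rows agree with the expected final table in the question, including the newly inserted (111, facebook) row.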

SQL - Datediff between rows with Rank Applied

I am trying to work out how to apply a datediff between rows where a rank is applied to the UserID.
An example of the data is below:
UserID Order Number ScanDateStart ScanDateEnd Minute Difference Rank | Minute Difference Rank vs Rank+1
User1 10-24 10:20:00 10:40:00 20 1 | 5
User1 10-25 10:45:00 10:50:00 5 2 | 33
User1 10-26 11:12:00 11:45:00 33 3 | NULL
User2 10-10 00:09:00 00:09:20 20 1 | 4
User2 10-11 00:09:24 00:09:25 1 2 | 15
User2 10-12 00:09:40 00:10:12 32 3 | 3
User2 10-13 00:10:15 00:10:35 20 4 | NULL
What I'm looking for is how to code the final column of this table.
The rank is applied per UserID, ordered by ScanDateStart.
Basically, I want to know the time between the ScanDateEnd of rank 1 and the ScanDateStart of rank 2, and so on, for each user (to calculate the time between order processing, etc.).
I appreciate the help.
This can be achieved by performing a LEFT JOIN to the same table on the UserID column and the Rank column, plus 1.
The following (simplified) pseudo-code should illustrate how to achieve this:
SELECT R.UserID,
R.Rank,
R1.Diff
FROM Rank R
LEFT JOIN Rank R1 ON R1.UserID = R.UserID AND R1.Rank = R.Rank + 1
Effectively, you are showing the UserID and Rank from the current row, but the Difference from the row of the same UserID with the Rank + 1.
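As a concrete variant of the self-join, the sketch below computes the gap the question describes, from the ScanDateEnd of one row to the ScanDateStart of the row with the next rank, instead of reading a precomputed Diff column. It assumes SQLite via Python's sqlite3 module, a hypothetical table named scans with a rank column called Rnk, and time-only strings (which SQLite interprets as times on 2000-01-01). The gap is reported in seconds because the User2 sample rows are only seconds apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE scans (UserID TEXT, OrderNumber TEXT,
                    ScanDateStart TEXT, ScanDateEnd TEXT, Rnk INTEGER);
INSERT INTO scans VALUES
  ('User1', '10-24', '10:20:00', '10:40:00', 1),
  ('User1', '10-25', '10:45:00', '10:50:00', 2),
  ('User1', '10-26', '11:12:00', '11:45:00', 3),
  ('User2', '10-10', '00:09:00', '00:09:20', 1),
  ('User2', '10-11', '00:09:24', '00:09:25', 2),
  ('User2', '10-12', '00:09:40', '00:10:12', 3),
  ('User2', '10-13', '00:10:15', '00:10:35', 4);
""")

# r is the current row, r1 is the same user's row with the next rank.
# The last rank of each user has no partner, so its gap is NULL.
rows = conn.execute("""
SELECT r.UserID,
       r.Rnk,
       CAST(strftime('%s', r1.ScanDateStart) AS INTEGER)
         - CAST(strftime('%s', r.ScanDateEnd) AS INTEGER) AS gap_seconds
FROM scans r
LEFT JOIN scans r1
  ON r1.UserID = r.UserID
 AND r1.Rnk = r.Rnk + 1
ORDER BY r.UserID, r.Rnk
""").fetchall()

for row in rows:
    print(row)
```

On engines with window functions, LEAD(ScanDateStart) OVER (PARTITION BY UserID ORDER BY ScanDateStart) would fetch the next start time without a self-join or a precomputed rank.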