I have the following table Employee storing any updates made on an employee:
EmployeeId DepartmentId Status From To
44 30 Recruited 01/01/2017 06/03/2017
44 56 IN 07/03/2017 07/03/2018
44 67 IN 06/05/2018 06/09/2018
44 33 IN 07/09/2018 02/02/2019
44 33 OUT 03/02/2019 31/12/2019
44 45 Recruited 01/02/2020 03/02/2020
44 45 IN 04/02/2020 NULL
I want to count how many times each employee has changed department, knowing that the employee life cycle is Recruited - IN - OUT,
and that some employees leave the company and later come back to it, as in this example.
I'm not sure what "Recruited", "In" and "Out" have to do with this. If each row represents a period of time when an employee was in a department, then use lag() to measure changes:
select employeeId, count(*)
from (select t.*,
             lag(departmentId) over (partition by employeeId order by from_date) as prev_departmentId
      from t
     ) t
where prev_departmentId is null or prev_departmentId <> departmentId
group by employeeId;
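To see what the lag() query counts, here is a minimal sketch that runs it against the sample rows in an in-memory SQLite database (SQLite 3.25 or later is assumed for window functions). The table name t and the columns from_date and to_date follow the query above; storing the dates as ISO strings so that they sort correctly is an assumption of this sketch, not part of the question. Note that the WHERE clause also keeps the first row of each employee, so the count is the number of department stints; the extra changes column, added here for illustration, leaves that first row out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE t (
        employeeId INTEGER, departmentId INTEGER, status TEXT,
        from_date TEXT, to_date TEXT
    )
""")
# Sample rows from the question, with DD/MM/YYYY rewritten as ISO dates (assumption).
rows = [
    (44, 30, "Recruited", "2017-01-01", "2017-03-06"),
    (44, 56, "IN", "2017-03-07", "2018-03-07"),
    (44, 67, "IN", "2018-05-06", "2018-09-06"),
    (44, 33, "IN", "2018-09-07", "2019-02-02"),
    (44, 33, "OUT", "2019-02-03", "2019-12-31"),
    (44, 45, "Recruited", "2020-02-01", "2020-02-03"),
    (44, 45, "IN", "2020-02-04", None),
]
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?)", rows)

query = """
    SELECT employeeId,
           COUNT(*) AS stints,   -- first department plus every change
           SUM(CASE WHEN prev_departmentId IS NOT NULL THEN 1 ELSE 0 END) AS changes
    FROM (SELECT t.*,
                 LAG(departmentId) OVER (PARTITION BY employeeId
                                         ORDER BY from_date) AS prev_departmentId
          FROM t) AS x
    WHERE prev_departmentId IS NULL OR prev_departmentId <> departmentId
    GROUP BY employeeId
"""
result = conn.execute(query).fetchall()
print(result)
```

For employee 44 the departments in date order are 30, 56, 67, 33, 33, 45, 45, so the query sees five stints and four changes.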
Tables - Store
Stores  Date        Customer_ID
A       01/01/2020  1111
C       01/01/2020  1111
F       02/01/2020  1234
A       02/01/2020  1111
A       02/01/2020  2222
Tables - Customer
Customer_ID  Age_Group     Income_Level
1111         26-30         Low
1234         25 and below  Mid
2222         31-60         High
I want to know how I can get this output.
Stores  Age_Group     Percentage_by_Age  Income_Level  Percentage_By_Income
A       25 and below  10                 Low           80
A       25 and below  10                 Mid           10
A       25 and below  10                 High          10
A       26 - 30       42                 Low           15
A       26 - 30       42                 Mid           65
A       26 - 30       42                 High          20
A       31 - 60       48                 Low           30
A       31 - 60       48                 Mid           50
A       31 - 60       48                 High          20
I am using SQL to query from different tables.
First I need to aggregate the number of customers by store. Then, within each store (Store A, for example), I want to find out what share of the customers falls into each age group (such as 25 and below), and within each age group, what share falls into each income level.
May I know how I can go about solving this query?
Thanks.
My current solution/thought process
SELECT
    stores AS Stores,
    Age_Group AS Age,
    Income_Level AS Income,
    COUNT(DISTINCT Customer_ID) AS Number_of_Customers
FROM tables JOIN tables....
GROUP BY Stores, Age, Income;
And then manually calculating the percentages.
But it doesn't seem right.
Is there a way to produce an example output table using just SQL?
For this requirement you can use Common Table Expressions. The code below should produce the expected output.
WITH
data_for_percent_by_income AS (
SELECT
COUNT(customer_id) AS cus_count_in_per_income_level_and_agegrp,
Age_group AS age_g,income_level AS inc_lvl
FROM
`project.dataset.Customer2`
WHERE
customer_id IN (
SELECT customer_id
FROM
`project.dataset.Store5`
WHERE stores='A')
GROUP BY
Age_group,income_level),
tot_cus_in_defined_income_level AS (
SELECT
COUNT(customer_id) AS cus_count_in_per_income_level,Age_group AS ag
FROM
`project.dataset.Customer2`
WHERE
customer_id IN (
SELECT
customer_id
FROM
`project.dataset.Store5`
WHERE stores='A')
GROUP BY
Age_group),
tot_cus_storeA AS(
SELECT
COUNT(*) AS tot_cus_in_A
FROM
`project.dataset.Customer2`
WHERE customer_id IN (
SELECT customer_id
FROM
`project.dataset.Store5`
WHERE stores='A') ),
final_view AS(
SELECT
ROUND(cus_count_in_per_income_level_and_agegrp*100/cus_count_in_per_income_level) AS p_by_inc,
age_g,inc_lvl
FROM
data_for_percent_by_income
INNER JOIN
tot_cus_in_defined_income_level
ON
data_for_percent_by_income.age_g=tot_cus_in_defined_income_level.ag )
SELECT
stores,tot_cus_in_defined_income_level.ag AS age_group,income_level,
ROUND(cus_count_in_per_income_level*100/tot_cus_in_A) AS percentage_by_age,
p_by_inc AS percentage_by_income
FROM
tot_cus_in_defined_income_level,tot_cus_storeA,`project.dataset.Customer2`,`project.dataset.Store5`
INNER JOIN
final_view
ON
age_group=final_view.age_g AND income_level=final_view.inc_lvl
WHERE
tot_cus_in_defined_income_level.ag = Age_group AND stores='A'
GROUP BY
stores,percentage_by_age,age_group,income_level,percentage_by_income
ORDER BY Age_group
I have attached screenshots of the input tables and the output table (Customer Table, Store Table, Output Table).
SELECT
s.Stores AS Stores,
c.age_group AS Age,
a.income_level AS Affluence,
CAST(COUNT(DISTINCT c.Customer_ID) AS numeric)*100/SUM(CAST(COUNT(DISTINCT c.Customer_ID) AS numeric)) OVER(PARTITION BY s.Stores ) AS Perc_of_Members
This is what I did in the end.
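For comparison, both percentages can also be computed with window functions over one grouped result. The following is only a sketch under stated assumptions: it uses an in-memory SQLite database (3.25 or later) rather than BigQuery, the lowercase table and column names are chosen here for illustration, and it loads just the five sample Store rows from the question, so the numbers differ from the illustrative output table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE store (stores TEXT, visit_date TEXT, customer_id INTEGER)")
conn.execute("CREATE TABLE customer (customer_id INTEGER, age_group TEXT, income_level TEXT)")
conn.executemany("INSERT INTO store VALUES (?, ?, ?)", [
    ("A", "01/01/2020", 1111), ("C", "01/01/2020", 1111), ("F", "02/01/2020", 1234),
    ("A", "02/01/2020", 1111), ("A", "02/01/2020", 2222),
])
conn.executemany("INSERT INTO customer VALUES (?, ?, ?)", [
    (1111, "26-30", "Low"), (1234, "25 and below", "Mid"), (2222, "31-60", "High"),
])

query = """
    WITH visits AS (            -- one row per store and distinct customer
        SELECT DISTINCT s.stores, s.customer_id, c.age_group, c.income_level
        FROM store s
        JOIN customer c ON c.customer_id = s.customer_id
    ),
    counts AS (                 -- customers per store, age group and income level
        SELECT stores, age_group, income_level, COUNT(*) AS n
        FROM visits
        GROUP BY stores, age_group, income_level
    )
    SELECT stores,
           age_group,
           -- share of the store's customers that fall in this age group
           ROUND(100.0 * SUM(n) OVER (PARTITION BY stores, age_group)
                       / SUM(n) OVER (PARTITION BY stores), 1) AS percentage_by_age,
           income_level,
           -- share of this age group's customers that fall in this income level
           ROUND(100.0 * n / SUM(n) OVER (PARTITION BY stores, age_group), 1)
               AS percentage_by_income
    FROM counts
    ORDER BY stores, age_group, income_level
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Store A has two distinct customers in the sample (1111 and 2222), so each of their age groups accounts for 50 percent of the store, and each income level accounts for 100 percent of its age group.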
I have two tables like the following:
Table 1:
e_id e_name e_salary e_age e_gender e_dept
---------------------------------------------------
1 sam 95000 45 male operations
2 bob 80000 21 male support
3 ann 125000 25 female analyst
Table 2:
d_salary d_age d_gender e_dept
----------------------------------
34000 25 male Admin
56000 41 female Tech
77000 35 female HR
I want the output something like this:
e_id e_name e_salary e_age e_gender e_dept d_salary d_age d_gender e_dept
1 sam 95000 45 male operations 34000 25 male Admin
2 bob 80000 21 male support 56000 41 female Tech
3 ann 125000 25 female analyst 77000 35 female HR
There is no dependency between the tables. No common columns. No primary or foreign key.
I tried a cross join, but that produces duplicate rows because it returns all M x N combinations.
I am new to this SQL thing. Can someone help me, please? Thanks in advance
I don't see the reasoning behind your desired output, but you can get it with the query below:
select a.e_id, a.e_name, a.e_salary, a.e_age, a.e_gender, a.e_dept, b.d_salary, b.d_age, b.d_gender, b.e_dept
from
(select e_id, e_name, e_salary, e_age, e_gender, e_dept, row_number() over (order by e_id) rn
from table1) a
inner join
(select d_salary, d_age, d_gender, e_dept, row_number() over (order by d_salary) rn
from table2) b
on a.rn = b.rn
Generally, you can create a row count with the row_number() window function on both tables and use it as the join criterion. But this requires a defined order for both tables, which means you have to explicitly tell the query why the Admin record is ordered first and must be joined to the first record of table 1:
SELECT
*
FROM (
SELECT
*,
row_number() OVER (ORDER BY e_id) as row_count -- assuming e_id is your order criterion
FROM table1
) t1
JOIN (
SELECT
*,
row_number() OVER (ORDER BY /*whatever you expect to be ordered*/) as row_count
FROM table2
) t2
ON t1.row_count = t2.row_count
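The pairing by row number can be tried out on the sample data with a short sketch like the one below. It assumes an in-memory SQLite database (3.25 or later), orders table1 by e_id and table2 by d_salary (the ordering the first answer picked, which is itself an assumption about what the asker wants), and aliases the second e_dept column as d_dept purely to keep the output unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE table1 (e_id INTEGER, e_name TEXT, e_salary INTEGER,
                                     e_age INTEGER, e_gender TEXT, e_dept TEXT)""")
conn.execute("""CREATE TABLE table2 (d_salary INTEGER, d_age INTEGER,
                                     d_gender TEXT, e_dept TEXT)""")
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "sam", 95000, 45, "male", "operations"),
    (2, "bob", 80000, 21, "male", "support"),
    (3, "ann", 125000, 25, "female", "analyst"),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?, ?, ?)", [
    (34000, 25, "male", "Admin"),
    (56000, 41, "female", "Tech"),
    (77000, 35, "female", "HR"),
])

query = """
    SELECT a.e_id, a.e_name, a.e_dept, b.d_salary, b.e_dept AS d_dept
    FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY e_id) AS rn FROM table1) AS a
    JOIN (SELECT *, ROW_NUMBER() OVER (ORDER BY d_salary) AS rn FROM table2) AS b
      ON a.rn = b.rn          -- n-th row of table1 meets n-th row of table2
    ORDER BY a.rn
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Because the join is an inner join, rows without a partner are dropped when the two tables have different sizes; a LEFT JOIN would keep the extra rows of the first table.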
I have a selection that returns
EMP DOC DATE
1 78 01/01
1 96 02/01
1 96 02/01
1 105 07/01
2 4 04/01
2 7 04/01
3 45 07/01
3 45 07/01
3 67 09/01
I want to add a row number (I'll use it as a primary id) that restarts whenever "EMP" changes and does not increase when the DOC is the same as the previous one, like this:
EMP DOC DATE ID
1 78 01/01 1
1 96 02/01 2
1 96 02/01 2
1 105 07/01 3
2 4 04/01 1
2 7 04/01 2
3 45 07/01 1
3 45 07/01 1
3 67 09/01 2
In SQL Server I could use LAG to compare with the previous DOC, but I can't find a way to do this in Sybase SQL Anywhere. I'm using ROW_NUMBER partitioned by "EMP", but it's not what I need.
SELECT EMP, DOC, DATE, ROW_NUMBER() OVER (PARTITION BY EMP ORDER BY EMP, DOC, DATE) ID -- <== THIS WILL CHANGE THE ROW NUMBER ON SAME DOC ON SAME EMP, SO WOULD NOT WORK.
Anyone have a direction for this?
You seem to want dense_rank():
select
emp,
doc,
date,
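dense_rank() over(partition by emp order by date, doc) id
from mytable
This numbers rows within groups having the same emp, and increments only when date or doc changes, without gaps. Ordering by date alone would give both EMP 2 rows (docs 4 and 7, same date) the same id, which is not what the expected output shows.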
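A minimal sketch of dense_rank() on the sample rows, run in an in-memory SQLite database (3.25 or later assumed) instead of SQL Anywhere. Ordering by date and then doc is a choice made here so that EMP 2, whose two documents share a date, still gets ids 1 and 2 as in the expected output. The dates are kept as the short strings from the question, which happen to sort correctly for this sample; a real date column is assumed in practice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (emp INTEGER, doc INTEGER, date TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", [
    (1, 78, "01/01"), (1, 96, "02/01"), (1, 96, "02/01"), (1, 105, "07/01"),
    (2, 4, "04/01"), (2, 7, "04/01"),
    (3, 45, "07/01"), (3, 45, "07/01"), (3, 67, "09/01"),
])

query = """
    SELECT emp, doc, date,
           -- same (date, doc) within an emp gets the same id; no gaps between ids
           DENSE_RANK() OVER (PARTITION BY emp ORDER BY date, doc) AS id
    FROM mytable
    ORDER BY emp, date, doc
"""
result = conn.execute(query).fetchall()
ids = [row[3] for row in result]
print(ids)
```

The ids come out as 1, 2, 2, 3 for EMP 1, then 1, 2 for EMP 2, then 1, 1, 2 for EMP 3, matching the table in the question.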
If performance is not an issue in your case, you can try something like:
SELECT tx.EMP, tx.DOC, tx.DATE, y.ID
FROM table_xxx tx
JOIN (SELECT EMP, DOC, ROW_NUMBER() OVER (PARTITION BY EMP ORDER BY DOC) ID
      FROM (SELECT EMP, DOC FROM table_xxx GROUP BY EMP, DOC) x) y
  ON tx.EMP = y.EMP AND tx.DOC = y.DOC
I have a table in which different clients are assigned to different MCs. For example, client 84 has switched MC 3 times. Now I want to get the latest MC of client 84. I wrote this query:
select max(cstmrMC.CustMCId),cstmrMC.CustId,cstmrMC.MCID,cstmrMC.AssignDate
from CustomerMC cstmrMC
where cstmrMC.CustId=84
group by cstmrMC.CustMCId,cstmrMC.CustId,cstmrMC.MCID,cstmrMC.AssignDate
ORDER BY cstmrMC.CustMCId,cstmrMC.CustId,cstmrMC.MCID,cstmrMC.AssignDate
which shows this result
CustMCId CustId MCID AssignDate
52 84 18 2013-10-01 18:21:56.000
59 84 7 2013-09-09 16:10:06.000
80 84 19 2013-10-09 03:54:00.000
156 84 21 2013-11-11 00:00:00.000
Now I want only this row:
156 84 21 2013-11-11 00:00:00.000
How can I get this result?
Use the row_number function to partition and order the customers so that the most recent MCID (based on AssignDate) is first within each customer.
WITH cteCustomers AS (
SELECT CustMCId, CustId, MCID, AssignDate,
ROW_NUMBER() OVER(PARTITION BY CustId ORDER BY AssignDate DESC) AS RowNum
FROM CustomerMC
)
SELECT CustMCId, CustId, MCID, AssignDate
FROM cteCustomers
WHERE RowNum = 1;
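As a quick check, the same query can be run on the four rows shown in the question. This sketch assumes an in-memory SQLite database (3.25 or later) in place of SQL Server and stores AssignDate as text, which sorts correctly for this timestamp format.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE CustomerMC (CustMCId INTEGER, CustId INTEGER,
                                         MCID INTEGER, AssignDate TEXT)""")
conn.executemany("INSERT INTO CustomerMC VALUES (?, ?, ?, ?)", [
    (52, 84, 18, "2013-10-01 18:21:56.000"),
    (59, 84, 7, "2013-09-09 16:10:06.000"),
    (80, 84, 19, "2013-10-09 03:54:00.000"),
    (156, 84, 21, "2013-11-11 00:00:00.000"),
])

query = """
    WITH cteCustomers AS (
        SELECT CustMCId, CustId, MCID, AssignDate,
               -- newest assignment of each customer gets RowNum = 1
               ROW_NUMBER() OVER (PARTITION BY CustId
                                  ORDER BY AssignDate DESC) AS RowNum
        FROM CustomerMC
    )
    SELECT CustMCId, CustId, MCID, AssignDate
    FROM cteCustomers
    WHERE RowNum = 1
"""
result = conn.execute(query).fetchall()
print(result)
```

Without a WHERE filter on CustId, the query returns the latest assignment for every customer in the table, not only for customer 84.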
I'm fairly new to mysql and need a query I just can't figure out. Given a table like so:
emp cat date amt cum
44 e1 2009-01-01 1 1
44 e2 2009-01-02 2 2
44 e1 2009-01-03 3 4
44 e1 2009-01-07 5 9
44 e7 2009-01-04 5 5
44 e2 2009-01-04 3 5
44 e7 2009-01-05 1 6
55 e7 2009-01-02 2 2
55 e1 2009-01-05 4 4
55 e7 2009-01-03 4 6
I need to select the latest date transaction per 'emp' and per 'cat'. The above table would produce something like:
emp cat date amt cum
44 e1 2009-01-07 5 9
44 e2 2009-01-04 3 5
44 e7 2009-01-05 1 6
55 e1 2009-01-05 4 4
55 e7 2009-01-03 4 6
I've tried something like:
select * from orders where emp=44 and category='e1' order by date desc limit 1;
select * from orders where emp=44 and category='e2' order by date desc limit 1;
....
but this doesn't feel right. Can anyone point me in the right direction?
This should work, but I haven't tested it.
SELECT orders.* FROM orders
INNER JOIN (
SELECT emp, cat, MAX(date) date
FROM orders
GROUP BY emp, cat
) criteria USING (emp, cat, date)
Basically, this uses a subquery to get the latest entry for each emp and cat, then joins that against the original table to get all the data for that order (since you can't GROUP BY amt and cum).
The answer given by @R.Bemrose should work, and here's another trick for comparison:
SELECT o1.*
FROM orders o1
LEFT OUTER JOIN orders o2
ON (o1.emp = o2.emp AND o1.cat = o2.cat AND o1.date < o2.date)
WHERE o2.emp IS NULL;
This assumes that the columns (emp, cat, date) comprise a candidate key, i.e. there can be only one row for a given combination of emp, cat and date. If two rows of the same emp and cat share the latest date, both are returned.
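Both approaches can be compared side by side on the sample table. The sketch below assumes an in-memory SQLite database instead of MySQL and adds an ORDER BY to each query only so that the two results can be compared directly; the table name orders and the column names come from the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (emp INTEGER, cat TEXT, date TEXT, amt INTEGER, cum INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?, ?)", [
    (44, "e1", "2009-01-01", 1, 1), (44, "e2", "2009-01-02", 2, 2),
    (44, "e1", "2009-01-03", 3, 4), (44, "e1", "2009-01-07", 5, 9),
    (44, "e7", "2009-01-04", 5, 5), (44, "e2", "2009-01-04", 3, 5),
    (44, "e7", "2009-01-05", 1, 6), (55, "e7", "2009-01-02", 2, 2),
    (55, "e1", "2009-01-05", 4, 4), (55, "e7", "2009-01-03", 4, 6),
])

# Approach 1: join back to the per-group maximum date.
group_max = conn.execute("""
    SELECT orders.* FROM orders
    INNER JOIN (
        SELECT emp, cat, MAX(date) AS date
        FROM orders
        GROUP BY emp, cat
    ) criteria USING (emp, cat, date)
    ORDER BY emp, cat
""").fetchall()

# Approach 2: keep rows for which no later row exists in the same group.
anti_join = conn.execute("""
    SELECT o1.*
    FROM orders o1
    LEFT OUTER JOIN orders o2
      ON (o1.emp = o2.emp AND o1.cat = o2.cat AND o1.date < o2.date)
    WHERE o2.emp IS NULL
    ORDER BY o1.emp, o1.cat
""").fetchall()

for row in group_max:
    print(row)
```

On this data the two queries return the same five rows, one per (emp, cat) pair.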