Joining two into one SSIS - sql

Does anyone know how to work this out?
I'm a beginner with SSIS. I have a derived column with WomenID, MenID, Date, and Status.
I need to "join" WomenID and MenID into one column (ID), keeping the date and status, for example:
WomenID| MenID| Date | Status
123 | 345 | 20160819 | M
768 | 762 | 19870830 | S
and need to turn it into
ID |Date |Status
123 |20160819 | M
768 |19870830 | S
345 |20160819 | M
762 |19870830 | S
I know this is a trivial question, but I can't see the light with this one.

One option uses a UNION:
SELECT WomenID AS ID, Date, Status FROM yourTable
UNION ALL
SELECT MenID, Date, Status FROM yourTable
If you want the exact ordering which you are showing us, we need to do more work. A computed column is one way to go:
WITH cte AS (
SELECT WomenID AS ID, Date, Status, 0 AS position FROM yourTable
UNION ALL
SELECT MenID, Date, Status, 1 FROM yourTable
)
SELECT ID, Date, Status
FROM cte
ORDER BY position, Status;
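To try the ordered UNION ALL against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database and the column types are assumptions made for illustration; the table name yourTable is the placeholder used in the answer.

```python
import sqlite3

# In-memory database holding the two sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (WomenID INTEGER, MenID INTEGER, Date TEXT, Status TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [(123, 345, "20160819", "M"), (768, 762, "19870830", "S")],
)

# Same query as in the answer: the position column keeps the women first.
unpivot_sql = """
WITH cte AS (
    SELECT WomenID AS ID, Date, Status, 0 AS position FROM yourTable
    UNION ALL
    SELECT MenID, Date, Status, 1 FROM yourTable
)
SELECT ID, Date, Status
FROM cte
ORDER BY position, Status
"""
unpivot_rows = conn.execute(unpivot_sql).fetchall()
for row in unpivot_rows:
    print(row)
```

Each source row yields two output rows, one per ID column, with Date and Status carried along unchanged.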

This should help:
select womenid as Id, Date, status from yourTable where status='F'
Union
Select menid as Id, Date, status from yourTable where status='M'
Hope it helps.

Related

How to query to capture recency and scale in one query?

I've built a query that calculates the number of ids from a table, per url_count.
with cte as (
select id, count(distinct url) url_count
from table
group by id
)
select sum(if(url_count >= 1,1,0)) scale
from cte
union all
select sum(if(url_count >= 2,1,0)) scale
from cte
union all
select sum(if(url_count >= 3,1,0)) scale
from cte
union all
select sum(if(url_count >= 4,1,0)) scale
from cte
union all
select sum(if(url_count >= 5,1,0)) scale
from cte
The query above says: "Give me the list of ids and the number of urls they each go to, then accumulate the number of ids who have gone to [1-5] or more urls."
It's of course a tedious method, but it works and outputs something like:
---------
| scale |
---------
|1213432|
|867554 |
|523523 |
|342232 |
|145889 |
---------
The table also has a date field covering the last 5 days, which I'm trying to add to this query. That is the challenge: adding a second layer of information to the query, i.e. recency. I've been trying multiple approaches to build a query that outputs all the combinations of the different scales per date.
The sort of output I've imagined is a pivot table which presents something like;
-------------------------------------------------------------
| date | url_co1 | url_co2 | url_co3 | url_co4 | url_co5|
-------------------------------------------------------------
|2020-01-05| 1213432 | 1112321 | 984332 | 632131 | 234124 |
|2020-01-04| 1012131 | 934242 | 867554 | 533242 | 134234 |
| ... | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... |
| ... | ... | ... | ... | ... | ... |
-------------------------------------------------------------
Where url_co[1-5] represents the number of ids that visited [1-5] or more urls, and date is the date on which that volume was captured. I have no idea how to write that, because once I query:
with cte as (
select id, date, count(distinct url) url_count
from table
group by id, date
)
I've aggregated per id, per date, and from there something goes wrong. =/
Hope that all made sense!
Please, please help! I would appreciate some guidance.
There must be a methodology for getting the combination of volumes per recency that I've missed!
I don't really follow the full question, but the first query can be simplified to:
select url_count, count(*) as this_count,
sum(count(*)) over (order by url_count desc) as descending_count
from (select id, count(distinct url) as url_count
from table
group by id
) t
group by url_count
order by url_count;
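For the per-date layout the question asks for, one possible approach is conditional aggregation over the per-id, per-date CTE. The sketch below assumes the intent is to count, for each date, the ids that visited at least k distinct urls on that date. The table name visits and its rows are made up for illustration, and only three thresholds are shown to keep it short. It runs the SQL through Python's built-in sqlite3 module.

```python
import sqlite3

# Hypothetical data: (id, date, url). Not taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (id INTEGER, date TEXT, url TEXT)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [
        (1, "2020-01-04", "a"),
        (1, "2020-01-04", "b"),
        (1, "2020-01-05", "a"),
        (2, "2020-01-04", "a"),
        (2, "2020-01-05", "a"),
        (2, "2020-01-05", "b"),
        (2, "2020-01-05", "c"),
        (3, "2020-01-05", "a"),
    ],
)

# First count distinct urls per id and date, then count, per date,
# how many ids reach each threshold (one output column per threshold).
pivot_sql = """
WITH cte AS (
    SELECT id, date, COUNT(DISTINCT url) AS url_count
    FROM visits
    GROUP BY id, date
)
SELECT date,
       SUM(CASE WHEN url_count >= 1 THEN 1 ELSE 0 END) AS url_co1,
       SUM(CASE WHEN url_count >= 2 THEN 1 ELSE 0 END) AS url_co2,
       SUM(CASE WHEN url_count >= 3 THEN 1 ELSE 0 END) AS url_co3
FROM cte
GROUP BY date
ORDER BY date DESC
"""
pivot_rows = conn.execute(pivot_sql).fetchall()
for row in pivot_rows:
    print(row)
```

Grouping the outer query by date is what turns the stacked UNION ALL rows of the original query into columns, one row per date.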

PostgreSQL: How to write a query for this scenario

I have this below table.
+_______+________+__________+________+
|Playid |billid| amount | Date |
+_______+________+__________+________+
|123 | 345 | 144.9 | 2015-09|
|123 | 456 | 200 | 2015-10|
+_______+________+__________+________+
I need to write a query that shows only the bill with the most recent transaction date (Date), like below.
+_______+________+__________+________+
|Playid |billid| amount | Date |
+_______+________+__________+________+
|123 | 456 | 200 | 2015-10|
+_______+________+__________+________+
Please help me with how to do this.
MAX(Date) can be used if you want to display only the playid and the most recent date.
However, the issue with what you are trying to do is that you want to display all the columns. This is where the ranking functions come into play. In this case you can use the row_number function like this:
SELECT PlayId, billid, amount, date
FROM
(
SELECT
PlayId, billid, amount, date,
row_number() over(partition by playid order by date desc) as rn
FROM tablename
) t
where rn = 1
The row_number() over(partition by playid order by date desc) will give each row within a playid group a ranking number; the first one (the lowest) will be the row with the most recent date. Then you just need to filter on the row number equal to 1.
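The row_number approach can be checked quickly with Python's built-in sqlite3 module. This is only a sketch: it assumes an SQLite build with window-function support (3.25 or later), the table name bills is hypothetical, and the row for playid 124 is made up to show that each playid gets its own latest row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bills (playid INTEGER, billid INTEGER, amount REAL, date TEXT)"
)
conn.executemany(
    "INSERT INTO bills VALUES (?, ?, ?, ?)",
    [
        (123, 345, 144.9, "2015-09"),
        (123, 456, 200, "2015-10"),
        (124, 789, 50, "2015-08"),  # hypothetical extra player
    ],
)

# Number the rows of each playid from newest to oldest, keep number 1.
latest_sql = """
SELECT playid, billid, amount, date
FROM (
    SELECT playid, billid, amount, date,
           ROW_NUMBER() OVER (PARTITION BY playid ORDER BY date DESC) AS rn
    FROM bills
) t
WHERE rn = 1
ORDER BY playid
"""
latest_rows = conn.execute(latest_sql).fetchall()
for row in latest_rows:
    print(row)
```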
Postgres offers distinct on. This is simpler to write and often has the best performance:
select distinct on (playid) t.*
from t
order by playid, date desc;

how to exclude the most recent null field from query result?

I want to design a query to find out whether there is at least one cat (select count(*) where rownum = 1) that hasn't been checked out.
One odd condition is that the result should exclude the most recent cat that didn't check out, so that:
TABLE schedule
-------------------------------------
| type | checkin | checkout
-------------------------------------
| cat | 20:10 | (null)
| dog | 19:35 | (null)
| dog | 19:35 | (null)
| cat | 15:31 | (null) ----> exclude this cat in this scenario
| dog | 12:47 | 13:17
| dog | 10:12 | 12:45
| cat | 08:27 | 11:36
should return 1, the first record
| cat | 20:10 | (null)
So far I have written a query like
select * from schedule where type = 'cat' and checkout is null order by checkin desc
However, this query does not handle the exclusion. I can certainly handle it in the service layer (e.g. Java), but I'm wondering whether it can be solved in the query itself, with good performance when the table holds a large amount of data (checkin and checkout are indexed, but type is not).
How about this?
Select *
From schedule
Where type='cat' and checkin=(select max(checkin) from schedule where type='cat' and checkout is null);
Assuming the checkin and checkout data type is string (which it shouldn't be, it should be DATE), to_date(checkin, 'hh24:mi') will create a value of the proper data type, DATE, assuming the first day of the current month as the "date" portion. It shouldn't matter to you, since presumably all the hours are from the same date. If in fact checkin/out are in the proper DATE data type, you don't need the to_date() call in order by (in two places).
I left out the checkout column from the output, since you are only looking for the rows with null in that column, so including it would provide no information. I would have left out type as well, but perhaps you'll want to have this for cats AND dogs at some later time...
with
schedule( type, checkin, checkout ) as (
select 'cat', '20:10', null from dual union all
select 'dog', '19:35', null from dual union all
select 'dog', '19:35', null from dual union all
select 'cat', '15:31', null from dual union all
select 'dog', '12:47', '13:17' from dual union all
select 'dog', '10:12', '12:45' from dual union all
select 'cat', '08:27', '11:36' from dual
)
-- end of test data; actual solution (SQL query) begins below this line
select type, checkin
from ( select type, checkin,
row_number() over (order by to_date(checkin, 'hh24:mi')) as rn
from schedule
where type = 'cat' and checkout is null
)
where rn > 1
order by to_date(checkin, 'hh24:mi') -- ORDER BY is optional
;
TYPE CHECKIN
---- -------
cat 20:10
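The same idea can be tried outside Oracle. The sketch below ports the query to SQLite through Python's built-in sqlite3 module, under two assumptions: the SQLite build supports window functions (3.25 or later), and checkin is stored as zero-padded 'HH:MM' text, so plain string ordering matches time ordering and no to_date() call is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schedule (type TEXT, checkin TEXT, checkout TEXT)")
conn.executemany(
    "INSERT INTO schedule VALUES (?, ?, ?)",
    [
        ("cat", "20:10", None),
        ("dog", "19:35", None),
        ("dog", "19:35", None),
        ("cat", "15:31", None),
        ("dog", "12:47", "13:17"),
        ("dog", "10:12", "12:45"),
        ("cat", "08:27", "11:36"),
    ],
)

# Number the unchecked-out cats by checkin time and skip the first one.
cats_sql = """
SELECT type, checkin
FROM (
    SELECT type, checkin,
           ROW_NUMBER() OVER (ORDER BY checkin) AS rn
    FROM schedule
    WHERE type = 'cat' AND checkout IS NULL
) t
WHERE rn > 1
ORDER BY checkin
"""
cat_rows = conn.execute(cats_sql).fetchall()
print(cat_rows)
```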

PostgreSQL return multiple rows with DISTINCT though only latest date per second column

Let's say I have the following database table (date truncated for example only; the two 'id_' prefix columns join with other tables)...
+-----------+---------+------+--------------------+-------+
| id_table1 | id_tab2 | date | description | price |
+-----------+---------+------+--------------------+-------+
| 1 | 11 | 2014 | man-eating-waffles | 1.46 |
+-----------+---------+------+--------------------+-------+
| 2 | 22 | 2014 | Flying Shoes | 8.99 |
+-----------+---------+------+--------------------+-------+
| 3 | 44 | 2015 | Flying Shoes | 12.99 |
+-----------+---------+------+--------------------+-------+
...and I have a query like the following...
SELECT id, date, description FROM inventory ORDER BY date ASC;
How do I SELECT all the descriptions, but only once each, while also returning only the latest year for each description? So I need the query to return the first and last rows from the sample data above; the second is not returned because the last row has a later date.
Postgres has something called distinct on. This is usually more efficient than using window functions. So, an alternative method would be:
SELECT distinct on (description) id, date, description
FROM inventory
ORDER BY description, date desc;
The row_number window function should do the trick:
SELECT id, date, description
FROM (SELECT id, date, description,
ROW_NUMBER() OVER (PARTITION BY description
ORDER BY date DESC) AS rn
FROM inventory) t
WHERE rn = 1
ORDER BY date ASC;

SQL - Select unique rows from a group of results

I have racked my brain on this problem for quite some time. I've also reviewed other questions but was unsuccessful.
The problem I have is, I have a list of results/table that has multiple rows with columns
| REGISTRATION | ID | DATE | UNITTYPE
| 005DTHGP | 172 | 2007-09-11 | MBio
| 005DTHGP | 1966 | 2006-09-12 | Tracker
| 013DTHGP | 2281 | 2006-11-01 | Tracker
| 013DTHGP | 2712 | 2008-05-30 | MBio
| 017DTNGP | 2404 | 2006-10-20 | Tracker
| 017DTNGP | 508 | 2007-11-10 | MBio
I am trying to select rows with unique REGISTRATIONS where the DATE is max (the latest). The IDs are not proportional to the DATE, meaning the ID could be a low value yet the DATE is later than that of the other matching row, and vice versa. Therefore I can't use MAX() on both the DATE and ID, and grouping just doesn't seem to work.
The results I want are as follows;
| REGISTRATION | ID | DATE | UNITTYPE
| 005DTHGP | 172 | 2007-09-11 | MBio
| 013DTHGP | 2712 | 2008-05-30 | MBio
| 017DTNGP | 508 | 2007-11-10 | MBio
PLEASE HELP!!!?!?!?!?!?!?
You want subqueries (derived tables), which not all SQL dialects support. In T-SQL you'd have something like
select r.registration, r.recent, t.id, t.unittype
from (
select registration, max([date]) recent
from #tmp
group by
registration
) r
left outer join
#tmp t
on r.recent = t.[date]
and r.registration = t.registration
TSQL:
declare @R table
(
Registration varchar(16),
ID int,
Date datetime,
UnitType varchar(16)
)
insert into @R values ('A','1','20090824','A')
insert into @R values ('A','2','20090825','B')
select R.Registration,R.ID,R.UnitType,R.Date from @R R
inner join
(select Registration,Max(Date) as Date from @R group by Registration) M
on R.Registration = M.Registration and R.Date = M.Date
This can be inefficient if you have thousands of rows in your table depending upon how the query is executed (i.e. if it is a rowscan and then a select per row).
In PostgreSQL, and assuming your data is indexed so that a sort isn't needed (or there are so few rows you don't mind a sort):
select distinct on (registration) * from whatever order by registration, "date" desc;
Taking each row in registration and descending date order, you will get the latest date for each registration first. DISTINCT throws away the duplicate registrations that follow.
select registration,ID,date,unittype
from your_table
where (registration, date) IN (select registration,max(date)
from your_table
group by registration)
This should work in MySQL:
SELECT registration, id, date, unittype
FROM table_name,
(SELECT registration AS temp_reg, MAX(date) as temp_date
FROM table_name GROUP BY registration) AS temp_table
WHERE registration=temp_reg and date=temp_date
The idea is to use a subquery in a FROM clause which throws up a single row containing the correct date and registration (the fields subjected to a group); then use the correct date and registration in a WHERE clause to fetch the other fields of the same row.
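The subquery-in-FROM idea can be run against the sample rows from the question. This is a minimal sketch using Python's built-in sqlite3 module; the table name units is hypothetical, and the comma join plus WHERE is written here as an explicit JOIN ... ON, which is equivalent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE units (registration TEXT, id INTEGER, date TEXT, unittype TEXT)"
)
conn.executemany(
    "INSERT INTO units VALUES (?, ?, ?, ?)",
    [
        ("005DTHGP", 172, "2007-09-11", "MBio"),
        ("005DTHGP", 1966, "2006-09-12", "Tracker"),
        ("013DTHGP", 2281, "2006-11-01", "Tracker"),
        ("013DTHGP", 2712, "2008-05-30", "MBio"),
        ("017DTNGP", 2404, "2006-10-20", "Tracker"),
        ("017DTNGP", 508, "2007-11-10", "MBio"),
    ],
)

# The derived table finds the latest date per registration; joining it
# back to the base table recovers the id and unittype of that row.
max_sql = """
SELECT u.registration, u.id, u.date, u.unittype
FROM units AS u
JOIN (
    SELECT registration AS temp_reg, MAX(date) AS temp_date
    FROM units
    GROUP BY registration
) AS m
  ON u.registration = m.temp_reg AND u.date = m.temp_date
ORDER BY u.registration
"""
max_rows = conn.execute(max_sql).fetchall()
for row in max_rows:
    print(row)
```

If two rows of one registration share the same latest date, this join returns both, so a tie-breaker would be needed in that case.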