combine results from different selects - sql

I have one table that contains the fields "ID", "mailSent" and "serviceUsed". "mailSent" contains the time when a mail was sent, and "serviceUsed" contains a counter that indicates whether the user has used the service for that particular mail.
I am trying to do a report that gives me back for each ID the following two facts:
1. The last time when a user has used the service, i.e., the time when for a particular user serviceUsed != 0
2. The total number of times a user has used the service, i.e., sum(serviceUsed) for each user
I would like to display this in one view and map the result always to the particular user. I can build each of the two queries separately but do not know how to combine it into one view. The two queries look as follows:
1. Select ID, max(mailSent) from Mails where serviceUsed > 0 group by ID
2. Select ID, sum(serviceUsed) from Mails group by ID
Notice that I cannot just combine them both because I also want to show the IDs that have never used my service, i.e., where serviceUsed = 0. Hence, if I just eliminate the where clause in my first query, then I will get wrong results for max(mailSent). Any idea how I can combine both?
In other words what I want is then something like this:
ID, max(mailSent), sum(serviceUsed)
where max(mailSent) is from the first query and sum(serviceUsed) from the second query.
Regards!

Try like this
SELECT * FROM
(
Select ID, max(mailSent) from Mails where serviceUsed > 0 group by ID
UNION ALL
Select ID, sum(serviceUsed) from Mails group by ID
) AS T

You can write it within one Query:
SELECT ID, sum(serviceUsed), max(mailSent) from Mails group by ID;
It doesn't matter that your second query has no serviceUsed > 0 condition. Rows with serviceUsed = 0 can be summed as well, because they add 0 to the total.
If you have the following input:
id  serviceUsed  mailSent
--------------------------
1   0            1.1.1970
1   4            3.1.1970
1   3            4.1.1970
2   0            2.1.1970
The Query should return this result:
id  serviceUsed  mailSent
--------------------------
1   7            4.1.1970
2   0            2.1.1970
But I wonder where your primary key is.

You want to do this with conditional aggregation:
select ID, max(case when serviceUsed > 0 then mailSent end),
sum(serviceUsed)
from Mails
group by ID;
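The conditional aggregation above can be checked against the sample rows from the previous answer. This is a minimal sketch, assuming an in-memory SQLite database driven from Python's sqlite3 module (the question does not name a database). The dates are rewritten as ISO strings on the assumption that the sample uses day.month.year, and the aliases lastUsed and totalUsed are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Mails (ID INTEGER, serviceUsed INTEGER, mailSent TEXT)")
conn.executemany(
    "INSERT INTO Mails VALUES (?, ?, ?)",
    [
        (1, 0, "1970-01-01"),
        (1, 4, "1970-01-03"),
        (1, 3, "1970-01-04"),
        (2, 0, "1970-01-02"),  # user 2 never used the service
    ],
)

# CASE yields NULL for rows with serviceUsed = 0, and MAX ignores NULLs,
# so only mails where the service was used count towards lastUsed.
report = conn.execute(
    """
    SELECT ID,
           MAX(CASE WHEN serviceUsed > 0 THEN mailSent END) AS lastUsed,
           SUM(serviceUsed) AS totalUsed
    FROM Mails
    GROUP BY ID
    ORDER BY ID
    """
).fetchall()

for row in report:
    print(row)
```

Under these assumptions, user 2 still appears in the report, with a NULL (None in Python) last-use time and a total of 0, instead of being dropped by a WHERE clause.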

Related

Is there a possibility to assign 0 for another occurrence of the same user?

I have a table like this:
Let's say that for every further occurrence of a login, e.g. 1234, I would like to have a value of 0 instead of 275, and likewise for 3678 a 0 instead of 300. I want to keep the profit/loss only for the first occurrence in the table. Is it possible to do this in SQL?
If you only want to keep the highest / lowest, you can join the table with itself.
If you only want the first (as in lowest row number) occurrence, you can use a window function and update all occurrences > 1
UPDATE SUB
set profit = 0
FROM (
SELECT
profit,
ROW_NUMBER() OVER (PARTITION BY Login ORDER BY Login) as cnt
FROM table
) SUB
WHERE SUB.cnt > 1
In Standard SQL, you can use:
update t
set profit = 0
where profit < (select max(profit) from t t2 where t2.login = t.login);
Note that specific databases may have alternative ways of writing this. However, your question does not have a database tag.
Also, this assumes that your table does not have duplicates. Unfortunately, duplicates would be problematic if these are the only two columns in the table.
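As a rough illustration of the window-function approach, here is a minimal sketch run against an in-memory SQLite database from Python (window functions need SQLite 3.25 or newer). The table name trades, its id column used to define "first", and all rows are hypothetical, since the original table is not shown; only the logins 1234 and 3678 and the values 275 and 300 come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: id gives a stable order so "first occurrence" is well defined.
conn.execute("CREATE TABLE trades (id INTEGER PRIMARY KEY, login INTEGER, profit INTEGER)")
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?)",
    [(1, 1234, 275), (2, 1234, 275), (3, 3678, 300), (4, 3678, 300), (5, 5555, 120)],
)

# Number the rows per login in id order, then zero every row after the first.
conn.execute(
    """
    UPDATE trades
    SET profit = 0
    WHERE id IN (
        SELECT id
        FROM (
            SELECT id,
                   ROW_NUMBER() OVER (PARTITION BY login ORDER BY id) AS rn
            FROM trades
        )
        WHERE rn > 1
    )
    """
)

result = conn.execute("SELECT id, login, profit FROM trades ORDER BY id").fetchall()
for row in result:
    print(row)
```

Ordering by a separate column rather than by the login itself is a deliberate choice here: without some ordering column, which duplicate counts as "first" is not determined.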

Select Rows and Flag Column if value found

I need help creating a query.
Current situation:
The output table contains Ids and levels. Each Id can appear several times.
Problem:
Now I want to know whether level 1 appears for an Id. If so, I want to mark the Id with the number 1. If an Id has only level 2 or zero, it should be marked with the number 0.
The output can be taken from the table below.
Use aggregation:
select id,
max(case when level = 1 then 1 else 0 end) as flag
from t
group by id;
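The aggregation above can be tried out as follows. This is a minimal sketch using an in-memory SQLite database from Python; the rows are invented for the illustration because the original table is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, level INTEGER)")
# Hypothetical sample data: ids 1 and 4 have a level-1 row, ids 2 and 3 do not.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 2), (3, 0), (3, 2), (4, 1)],
)

# Each row contributes 1 if it is level 1, else 0; MAX per id is 1
# as soon as at least one level-1 row exists for that id.
flags = conn.execute(
    """
    SELECT id,
           MAX(CASE WHEN level = 1 THEN 1 ELSE 0 END) AS flag
    FROM t
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for row in flags:
    print(row)
```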

SQL Rows to Columns if column values are unknown

I have a table that has demographic information about a set of users which looks like this:
User_id Category IsMember
1 College 1
1 Married 0
1 Employed 1
1 Has_Kids 1
2 College 0
2 Married 1
2 Employed 1
3 College 0
3 Employed 0
The result set I want is a table that looks like this:
User_Id|College|Married|Employed|Has_Kids
1 1 0 1 1
2 0 1 1 0
3 0 0 0 0
In other words, the table indicates the presence or absence of a category for each user. Sometimes the user will have a category where the value is false, and sometimes the user will have no row for a category, in which case IsMember is assumed to be false.
Also, from time to time additional categories will be added to the data set, and I'm wondering if it's possible to do this query without knowing up front all the possible category names. In other words, I won't be able to specify all the column names I want to count in the result. (Note that only user 1 has the category "has_kids" and user 3 is missing a row for the category "married".)
(using Postgres)
Thanks.
You can use jsonb functions.
with titles as (
select jsonb_object_agg(Category, Category) as titles,
jsonb_object_agg(Category, -1) as defaults
from demog
),
the_rows as (
select null::bigint as id, titles as data
from titles
union
select User_id, defaults || jsonb_object_agg(Category, IsMember)
from demog, titles
group by User_id, defaults
)
select id, string_agg(value, '|' order by key)
from (
select id, key, value
from the_rows, jsonb_each_text(data)
) x
group by id
order by id nulls first
You can see a running example in http://rextester.com/QEGT70842
You can replace -1 with 0 for the default value and '|' with ',' for the separator.
You can install the tablefunc module and use the crosstab function.
https://www.postgresql.org/docs/9.1/static/tablefunc.html
I found a Postgres function script called colpivot which does the trick. I ran the script to create the function, then created the table in one statement:
select colpivot ('_pivoted', 'select * from user_categories', array['user_id'],
array ['category'], '#.is_member', null);
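Another way to cope with unknown category names is to build the pivot query in two steps from client code: first read the distinct categories, then generate one conditional aggregate per category. The following is a minimal sketch of that idea, not one of the answers above. It assumes Python with an in-memory SQLite database standing in for Postgres, and the helper quote_ident is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demog (User_id INTEGER, Category TEXT, IsMember INTEGER)")
conn.executemany(
    "INSERT INTO demog VALUES (?, ?, ?)",
    [
        (1, "College", 1), (1, "Married", 0), (1, "Employed", 1), (1, "Has_Kids", 1),
        (2, "College", 0), (2, "Married", 1), (2, "Employed", 1),
        (3, "College", 0), (3, "Employed", 0),
    ],
)

def quote_ident(name):
    """Quote a value for use as an SQL identifier (hypothetical helper)."""
    return '"' + name.replace('"', '""') + '"'

# Step 1: discover the categories that currently exist.
categories = [
    r[0] for r in conn.execute("SELECT DISTINCT Category FROM demog ORDER BY Category")
]

# Step 2: one MAX(CASE ...) column per category. The category value is passed
# as a bound parameter; missing rows fall through to the ELSE 0 default.
columns = ", ".join(
    f"MAX(CASE WHEN Category = ? THEN IsMember ELSE 0 END) AS {quote_ident(c)}"
    for c in categories
)
sql = f"SELECT User_id, {columns} FROM demog GROUP BY User_id ORDER BY User_id"

cursor = conn.execute(sql, categories)
header = [d[0] for d in cursor.description]
pivot = cursor.fetchall()

print(header)
for row in pivot:
    print(row)
```

The columns come out in alphabetical category order rather than the order shown in the question, and a newly added category shows up as an extra column the next time the query is generated.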

SQL : Check if result = number for each id

I have this sort of structure
ID STATUS
1 FIRSTSTAT
2 FIRSTSTAT
3 FIRSTSTAT
1 SECSTAT
3 SECSTAT
3 THIRDSTAT
3 FOURTHSTAT
3 FIFTHSTAT
I want to get ID 3 back because it has all of the following statuses (FIRSTSTAT, SECSTAT, THIRDSTAT). Do you have an idea how I could do that?
It should be done by explicitly giving the statuses, because other statuses exist. So SELECT FROM WHERE = 'THIRDSTAT' is not OK, since the ID should have all three statuses, not only one of them.
So I guess it should be done calculating the SUM or something like that.
I tried the following but of course, it does not work :
SELECT
FROM
WHERE
AND
AND
If the number of different status values is known to always be 3:
select id
from tablename
where status in ('FIRSTSTAT', 'SECSTAT', 'THIRDSTAT')
group by id
having count(distinct status) = 3
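The query above can be run against the sample rows from the question. This is a minimal sketch, assuming an in-memory SQLite database driven from Python; the table name tablename is taken from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [
        (1, "FIRSTSTAT"), (2, "FIRSTSTAT"), (3, "FIRSTSTAT"),
        (1, "SECSTAT"), (3, "SECSTAT"),
        (3, "THIRDSTAT"), (3, "FOURTHSTAT"), (3, "FIFTHSTAT"),
    ],
)

# Keep only the three wanted statuses, then require that an id
# has all three distinct values among its remaining rows.
ids = [
    r[0]
    for r in conn.execute(
        """
        SELECT id
        FROM tablename
        WHERE status IN ('FIRSTSTAT', 'SECSTAT', 'THIRDSTAT')
        GROUP BY id
        HAVING COUNT(DISTINCT status) = 3
        """
    )
]
print(ids)
```

Counting distinct statuses rather than rows means an id with a repeated status is not counted twice.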

Trouble Finding ID's with Duplicate Fields

My data looks like this:
ID Email
1 someone#hotmail.com
2 someone1#hotmail.com
3 someone2#hotmail.com
4 someone3#hotmail.com
5 someone4#hotmail.com
6 someone5#hotmail.com
There should be exactly 1 email per ID, but there's not.
> dim(data)
[1] 5071 2
> length(unique(data$Person_Onyx_Id))
[1] 5071
> length((data$Email))
[1] 5071
> length(unique(data$Email))
[1] 4481
So, I need to find the ID's with duplicated email addresses.
Seems like this should be easy, but I'm striking out:
> sqldf("select ID, count(Email) from data group by ID having count(Email) > 1")
[1] ID count(Email)
<0 rows> (or 0-length row.names)
I've also tried taking off the having clause and sending the result to an object and sorting the object by the count(Email)... it appears that every ID has count(Email) of 1...
I would dput the actual data but I can't due to the sensitivity of email addresses.
Are you also sure you don't have the opposite condition, multiple ids with the same email?
select Email, count(*)
from data
group by Email
having count(*) > 1;
My guess is that you have NULL emails. You could find this by using count(*) rather than count(email):
select ID, count(*)
from data
group by ID
having count(*) > 1;
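To get from the duplicated emails back to the IDs that carry them, the grouped query can be used as a subquery. This is a minimal sketch with made-up rows, assuming an in-memory SQLite database from Python (sqldf also uses SQLite as its default backend); the example.com addresses are placeholders.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (ID INTEGER, Email TEXT)")
# Hypothetical rows: IDs 1 and 3 share an address, IDs 4 and 5 have no email.
conn.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [
        (1, "a@example.com"),
        (2, "b@example.com"),
        (3, "a@example.com"),
        (4, None),
        (5, None),
    ],
)

# IDs whose email occurs more than once. NULL never matches IN,
# so rows without an email are not reported here.
shared = conn.execute(
    """
    SELECT ID, Email
    FROM data
    WHERE Email IN (
        SELECT Email FROM data GROUP BY Email HAVING COUNT(*) > 1
    )
    ORDER BY ID
    """
).fetchall()

# Rows without an email are counted separately.
missing = conn.execute("SELECT COUNT(*) FROM data WHERE Email IS NULL").fetchone()[0]

print(shared)
print(missing)
```

If NULL emails are present, they lower the unique(data$Email) count just as duplicated addresses do, so checking both results helps tell the two cases apart.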