How can I count field value based on another field?

I have an EmailEvent table. It is an STI table that stores opens and clicks (distinguished by the type attribute). Right now I have this code, which reports how many emails each recipient address (the email_address field) appears in:
EmailEvent.where("type = 'OpenEmailEvent' OR type = 'ClickEmailEvent'")
.where(email_id: email_ids).happened_between_dates(start_date, end_date)
.select("email_address, count(distinct email_id) as emails_count")
.group(:email_address).reorder(nil)
It generates the following SQL:
SELECT email_address, count(distinct email_id) as emails_count
FROM "email_events"
WHERE "email_events"."deleted_at" IS NULL
AND (type = 'OpenEmailEvent' OR type = 'ClickEmailEvent')
AND "email_events"."email_id" IN (85487, 75554, 85489, 77184, 78562, 75587, 82879, 85535, 85534)
AND (created_at between '2017-02-28 22:52:01.000000' AND '2017-03-29 05:59:59.000000')
GROUP BY "email_events"."email_address"
I need to report how many opens and how many clicks there are per email_address. I can't use count(id) because it would count opens and clicks together, so I need to use the type attribute somehow, but I can't figure out exactly how. Can somebody suggest what I should do here?

Replace this:
count(distinct email_id) as emails_count
with conditional aggregation using CASE (or IIF in SQL Server):
sum(case when type = 'OpenEmailEvent' then 1 else 0 end) as opens,
sum(case when type = 'ClickEmailEvent' then 1 else 0 end) as clicks
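To check the idea without touching the real database, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and sample rows are assumptions made up for illustration; only the two SUM(CASE ...) columns come from the suggestion above.

```python
import sqlite3

# Hypothetical in-memory stand-in for the email_events table (SQLite, not the asker's database).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email_events (email_address TEXT, email_id INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO email_events VALUES (?, ?, ?)",
    [
        ("a@example.com", 1, "OpenEmailEvent"),
        ("a@example.com", 1, "OpenEmailEvent"),
        ("a@example.com", 2, "ClickEmailEvent"),
        ("b@example.com", 1, "ClickEmailEvent"),
    ],
)

def opens_and_clicks(connection):
    # Each SUM adds 1 only for rows of its own type, so one GROUP BY yields both counts.
    return connection.execute(
        """
        SELECT email_address,
               SUM(CASE WHEN type = 'OpenEmailEvent'  THEN 1 ELSE 0 END) AS opens,
               SUM(CASE WHEN type = 'ClickEmailEvent' THEN 1 ELSE 0 END) AS clicks
        FROM email_events
        WHERE type IN ('OpenEmailEvent', 'ClickEmailEvent')
        GROUP BY email_address
        ORDER BY email_address
        """
    ).fetchall()

rows = opens_and_clicks(conn)
print(rows)
```

In the Rails code, the same two expressions would presumably go into the .select(...) string in place of the count(distinct email_id) column.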

Related

SQL select column group by where the ratio of a value is 1

I am using PSQL.
I have a table with a few columns. One column is event, which can have 4 different values: X1, X2, Y1, Y2. Another column is the name of the service, and I want to group by that column.
My goal is a query that verifies, for each service name, that count(X1) == count(X2), and if not, displays a new column with "error".
Is this even possible? I am kinda new to SQL and not sure how to write this.
So far I tried something like this
select
service_name, event, count(service_name)
from
service_table st
group by
(service_name, event);
I am getting the count of each event for a specific service_name, but I would like to verify that the count of event 1 == the count of event 2 for each service_name.
I want to add that each service_name can only have 2 different events.

You may not need a subquery/CTE for this, but it will work (and makes the logic easier to follow):
WITH event_counts_by_service AS (SELECT
service_name
, COUNT(CASE WHEN event='X1' THEN 1 END) AS count_x1
, COUNT(CASE WHEN event='X2' THEN 1 END) AS count_x2
FROM service_table
GROUP BY service_name)
SELECT service_name
, CASE WHEN count_x1=count_x2 THEN NULL ELSE 'Error' END AS are_counts_equal
FROM event_counts_by_service
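A quick way to try the CTE is Python's built-in sqlite3 module, which accepts the same query shape. This is only a sketch: the service names and rows are invented, and the flag column is called status here for illustration.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE service_table (service_name TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO service_table VALUES (?, ?)",
    [
        ("billing", "X1"), ("billing", "X2"),                # balanced
        ("search", "X1"), ("search", "X1"), ("search", "X2"),  # one X1 too many
    ],
)

query = """
WITH event_counts_by_service AS (
    SELECT service_name,
           COUNT(CASE WHEN event = 'X1' THEN 1 END) AS count_x1,
           COUNT(CASE WHEN event = 'X2' THEN 1 END) AS count_x2
    FROM service_table
    GROUP BY service_name
)
SELECT service_name,
       CASE WHEN count_x1 = count_x2 THEN NULL ELSE 'Error' END AS status
FROM event_counts_by_service
ORDER BY service_name
"""

# COUNT skips NULLs, so each CASE without an ELSE counts only its own event.
result = conn.execute(query).fetchall()
print(result)
```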

Computing conversion rate by counting TRUE/FALSE statements

Sorry if this sounds very basic; bear with me.
I need to determine the conversion rate of 3 ads, each representing a product; that would be subscription divided by the number of people who clicked the ad.
COLUMNS:
person_id - unique identifier of the person
date - date they were shown the ad
ad_id - content of the ad: ad_1_product1, ad_2_product2, or ad_3_product3
clicked (TRUE/FALSE) - clicked on the ad
signed_up - (TRUE/FALSE) created an account
subscribed (TRUE/FALSE) - started a paid subscription
I set clicked, signed_up and subscribed as boolean.
MY CODE:
SELECT ad_id, (count(subscribed) / count(clicked)) as CR
FROM videoadcampaign
WHERE subscribed = 'TRUE' AND clicked = 'TRUE'
GROUP BY ad_id;
Of course, the code above gives me a ratio of 1, because SQL is still counting the total and dividing by the same number because of those conditions.
I am totally stuck.
I will also need to calculate other KPIs for clicks and signed_up, so filtering those booleans and putting them into a ratio is the core of what I need to do.
Is there a way I can tell SQL to compute CR = SUBS (TRUE) / SUBS (TRUE + FALSE) [or total count] and then filter by CLICK = TRUE?
Thank you tons for your help!

It depends on your database, but the general notion would be:
SELECT ad_id,
(SUM(CASE WHEN subscribed = 'TRUE' THEN 1.0 ELSE 0 END) /
SUM(CASE WHEN clicked = 'TRUE' THEN 1 ELSE 0 END)
) as CR
FROM videoadcampaign
GROUP BY ad_id;
In many databases, you can do something like this if the columns are integers (0 = false, 1 = true). Multiplying by 1.0 avoids integer division in databases that truncate integer results:
SELECT ad_id, SUM(subscribed) * 1.0 / SUM(clicked) as CR
FROM videoadcampaign
WHERE clicked = 1
GROUP BY ad_id;
Or even:
SELECT ad_id, AVG(subscribed * 1.0) as CR
FROM videoadcampaign
WHERE clicked = 1
GROUP BY ad_id;
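Here is a small sketch of the integer-column variant, run through Python's built-in sqlite3 module. It assumes clicked and subscribed are stored as 0/1 integers (SQLite has no separate boolean type) and uses made-up rows; the point is that restricting to clicked = 1 first makes the average of subscribed the conversion rate among clickers.

```python
import sqlite3

# Hypothetical rows; clicked/subscribed are stored as 0/1 integers for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videoadcampaign (person_id INTEGER, ad_id TEXT, clicked INTEGER, subscribed INTEGER)")
conn.executemany(
    "INSERT INTO videoadcampaign VALUES (?, ?, ?, ?)",
    [
        (1, "ad_1_product1", 1, 1),
        (2, "ad_1_product1", 1, 0),
        (3, "ad_1_product1", 0, 0),  # never clicked, so excluded by the WHERE
        (4, "ad_2_product2", 1, 1),
        (5, "ad_2_product2", 1, 1),
        (6, "ad_2_product2", 1, 1),
        (7, "ad_2_product2", 1, 0),
    ],
)

conversion = conn.execute(
    """
    SELECT ad_id,
           SUM(subscribed) * 1.0 / COUNT(*) AS cr_from_sums,  -- subscribers / clickers
           AVG(subscribed * 1.0)            AS cr_from_avg    -- same ratio via AVG
    FROM videoadcampaign
    WHERE clicked = 1
    GROUP BY ad_id
    ORDER BY ad_id
    """
).fetchall()
print(conversion)
```

The same pattern should carry over to the other KPIs by swapping the column inside SUM/AVG and the condition in WHERE.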

Find duplicates using parcel number and updated their status. SQL Server 2014

I have a problem with duplicate records in a SQL Server 2014 database.
Users get a small postcard with a parcel number printed on them.
The postcard also shows a link to a simple form that they can use, to register their parcel.
The form unfortunately does not have any type of validation, to ensure that the same parcel does not get submitted more than once.
I currently have no control over the web form, and I am not sure how long it will take for the responsible team to implement validation on it.
So I have to come up with a routine to deactivate the duplicate records, and keep only one.
This has to be a query that processes records in bulk, with no tokens passed to the routine.
When the web form gets submitted, it creates a record id in sequential order, and assigns an application status of 'Registered'.
I think the way to correct this would be to take the highest record id value per parcel as the one to keep; the rest will have to be deactivated:
Deactivate the records that are not the most recent by setting a rec_status of "I"
Set APPLICATION_STATUS to 'Closed' on the records that are not the most recent
The query I use, returns 4 columns: Record Id, Parcel Number, Record Status, and Application Status
SELECT
B.[RECORD_ID],
B.[PARCEL_NBR],
B.[RECORD_STATUS], -- The value of this column would be "I" for the duplicate records.
B.[APPLICATION_STATUS]
FROM
A_TABLE A
INNER JOIN B_TABLE B
ON A.PARCEL_NBR = B.PARCEL_NBR
AND (A.APPLICATION_STATUS IS NULL
OR B.APPLICATION_STATUS = 'Registered');
Initial Output:
RECORD_ID  PARCEL_NBR  RECORD_STATUS  APPLICATION_STATUS
REC-00081  0608012098  A              Registered
REC-00082  0608012098  A              Registered
REC-00083  0608012098  A              Registered
Expected Output:
RECORD_ID  PARCEL_NBR  RECORD_STATUS  APPLICATION_STATUS
REC-00081  0608012098  I              Closed      (this record got updated)
REC-00082  0608012098  I              Closed      (this record got updated)
REC-00083  0608012098  A              Registered
I think that perhaps a cursor might be part of the solution? Honestly I am not sure. I kindly ask for your help.

You can use window functions and case logic:
-- Every row except the highest RECORD_ID per parcel is reported as inactive and closed.
SELECT B.[RECORD_ID], B.[PARCEL_NBR],
(CASE WHEN ROW_NUMBER() OVER (PARTITION BY B.PARCEL_NBR ORDER BY B.RECORD_ID DESC) > 1
THEN 'I' ELSE B.[RECORD_STATUS]
END) as RECORD_STATUS,
(CASE WHEN ROW_NUMBER() OVER (PARTITION BY B.PARCEL_NBR ORDER BY B.RECORD_ID DESC) > 1
THEN 'Closed' ELSE B.APPLICATION_STATUS
END) as APPLICATION_STATUS
FROM A_TABLE A JOIN
B_TABLE B
ON A.PARCEL_NBR = B.PARCEL_NBR AND
(A.APPLICATION_STATUS IS NULL OR B.APPLICATION_STATUS = 'Registered');

I'm not sure what role A_TABLE plays in this, but this may give you what you want:
update B_TABLE
set record_Status = 'I'
, application_status = 'Closed'
where record_status = 'A'
and application_status = 'Registered'
and record_id <> (select max(record_id)
from B_TABLE b
where b.parcel_nbr = B_TABLE.parcel_nbr
and b.record_status = 'A'
and b.application_status = 'Registered');
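The UPDATE can be rehearsed on throwaway data before running it for real. Below is a minimal sketch with Python's built-in sqlite3 module, not SQL Server; the second parcel is invented to show that a parcel with a single record is left alone, and the status is set to 'Closed' as the question's step list asks.

```python
import sqlite3

# Hypothetical copy of B_TABLE with the question's sample rows plus one non-duplicated parcel.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE B_TABLE (record_id TEXT, parcel_nbr TEXT, record_status TEXT, application_status TEXT)"
)
conn.executemany(
    "INSERT INTO B_TABLE VALUES (?, ?, ?, ?)",
    [
        ("REC-00081", "0608012098", "A", "Registered"),
        ("REC-00082", "0608012098", "A", "Registered"),
        ("REC-00083", "0608012098", "A", "Registered"),
        ("REC-00090", "0608012099", "A", "Registered"),
    ],
)

# Deactivate every active, registered row that is not the highest record_id of its parcel.
conn.execute(
    """
    UPDATE B_TABLE
    SET record_status = 'I',
        application_status = 'Closed'
    WHERE record_status = 'A'
      AND application_status = 'Registered'
      AND record_id <> (SELECT MAX(b.record_id)
                        FROM B_TABLE AS b
                        WHERE b.parcel_nbr = B_TABLE.parcel_nbr
                          AND b.record_status = 'A'
                          AND b.application_status = 'Registered')
    """
)

after_update = conn.execute(
    "SELECT record_id, record_status, application_status FROM B_TABLE ORDER BY record_id"
).fetchall()
print(after_update)
```

Note that MAX on record_id compares strings here, which only picks the latest record as long as the ids stay zero-padded to the same width.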

Comparing "queries" and then saving the result for new select statement

I am trying to make a query that will check whether a user is logged in or not.
The data is stored as two separate rows: one with "in" when a user logs in, and another with "out" when they log out. I need to find everyone who is currently logged in but has not logged out, so what I've tried is comparing two select statements. This gives me the name (UNILOGIN) of all the people currently logged in and not out:
select UNILOGIN from timereg where date = CONVERT(DATE,GETDATE(),110) and CHECKEDIN = 'IND'
except
select UNILOGIN from timereg where date = CONVERT(DATE,GETDATE(),110) and CHECKEDIN = 'UD'
I then need to find their top 1 time, i.e. when they checked in. How would one write a statement that gets that result in one query string, if it is possible at all? Something like:
SELECT TOP 1 UNILOGIN, TIME from TIMEREG where UNILOGIN = "result of query"
Tell me if I need to elaborate.

Aggregate. Get one result row per unilogin, make sure it has an 'IND' record and no 'UD' record and select the maximum login time for the date in question.
select unilogin, max(time)
from timereg
where date = convert(date, getdate(), 110)
and checkedin in ('IND', 'UD')
group by unilogin
having count(case when checkedin = 'IND' then 1 end) > 0
and count(case when checkedin = 'UD' then 1 end) = 0;
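To see the HAVING filter at work, here is a small sketch using Python's built-in sqlite3 module. The users, times, and the fixed date parameter (standing in for getdate()) are assumptions for illustration only.

```python
import sqlite3

# Hypothetical timereg rows for a single day; the date is passed in instead of using getdate().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE timereg (unilogin TEXT, date TEXT, time TEXT, checkedin TEXT)")
conn.executemany(
    "INSERT INTO timereg VALUES (?, ?, ?, ?)",
    [
        ("anna", "2017-03-29", "08:00", "IND"),
        ("anna", "2017-03-29", "09:15", "IND"),
        ("bob",  "2017-03-29", "08:05", "IND"),
        ("bob",  "2017-03-29", "12:00", "UD"),   # bob has checked out again
    ],
)

def still_checked_in(connection, day):
    # Keep users with at least one 'IND' row and no 'UD' row on the given day.
    return connection.execute(
        """
        SELECT unilogin, MAX(time) AS last_checkin
        FROM timereg
        WHERE date = ?
          AND checkedin IN ('IND', 'UD')
        GROUP BY unilogin
        HAVING COUNT(CASE WHEN checkedin = 'IND' THEN 1 END) > 0
           AND COUNT(CASE WHEN checkedin = 'UD'  THEN 1 END) = 0
        """,
        (day,),
    ).fetchall()

logged_in = still_checked_in(conn, "2017-03-29")
print(logged_in)
```

Like the EXCEPT version in the question, this drops anyone who has any 'UD' row that day, even if they checked in again afterwards.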

multiple count(distinct)

I get an error unless I remove one of the count(distinct ...). Can someone tell me why and how to fix it?
I'm in vfp. iif([condition],[if true],[else]) is equivalent to case when
SELECT * FROM dpgift where !nocalc AND rectype = "G" AND sol = "EM112" INTO CURSOR cGift
SELECT
list_code,
count(distinct iif(language != 'F' AND renew = '0' AND type = 'IN',donor,0)) as d_Count_E_New_Indiv,
count(distinct iif(language = 'F' AND renew = '0' AND type = 'IN',donor,0)) as d_Count_F_New_Indiv /*it works if i remove this*/
FROM cGift gift
LEFT JOIN
(select didnumb, language, type from dp) d
on cast(gift.donor as i) = cast(d.didnumb as i)
GROUP BY list_code
ORDER by list_code
edit:
apparently, you can't use multiple distinct commands on the same level. Any way around this?

VFP does NOT support two "DISTINCT" clauses in the same query... PERIOD... I've even tested on a simple table of my own, DIRECTLY from within VFP such as
select count( distinct Col1 ) as Cnt1, count( distinct col2 ) as Cnt2 from MyTable
causes a crash. I don't know why you are trying to do DISTINCT, as you are just testing a condition... It more accurately appears you just want a COUNT of entries per each category of criteria instead of an actual DISTINCT.
Because you are not "alias.field" referencing your columns in your query, I don't know which column is the basis of what. However, to help handle your DISTINCT, and it appears you are running from WITHIN a VFP app as you are using the "INTO CURSOR" clause (which would not be associated with any OleDB .net development), I would pre-query and group those criteria, something like...
select list_code,
donor,
max( iif( language != 'F' and renew = '0' and type = 'IN', 1, 0 )) as EQualified,
max( iif( language = 'F' and renew = '0' and type = 'IN', 1, 0 )) as FQualified
from
cGift gift
left join (select didnumb, language, type from dp) d
on cast(gift.donor as i) = cast(d.didnumb as i)
group by
list_code,
donor
into
cursor cGroupedByDonor
so the above will ONLY get a count of 1 per donor per list code, no matter how many records qualify. In addition, if one record has an "F" and another does NOT, then you'll have a value of 1 in EACH of the columns... Then you can do something like...
select
list_code,
sum( EQualified ) as DistEQualified,
sum( FQualified ) as DistFQualified
from
cGroupedByDonor
group by
list_code
into
cursor cDistinctByListCode
then run from that...
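The two-step idea (flag each donor once per list code, then sum the flags) can be tried outside VFP too. The sketch below uses Python's built-in sqlite3 module with CASE in place of iif; the gifts table is a hypothetical, already-joined stand-in for cGift plus dp, and the rows are invented.

```python
import sqlite3

# Hypothetical pre-joined table standing in for cGift LEFT JOIN dp.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gifts (list_code TEXT, donor INTEGER, language TEXT, renew TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO gifts VALUES (?, ?, ?, ?, ?)",
    [
        ("L1", 1, "E", "0", "IN"),
        ("L1", 1, "E", "0", "IN"),  # same donor twice: must count once
        ("L1", 2, "F", "0", "IN"),
        ("L1", 3, "E", "1", "IN"),  # renewal, so it qualifies for neither column
        ("L2", 4, "E", "0", "IN"),
        ("L2", 5, "E", "0", "IN"),
    ],
)

distinct_counts = conn.execute(
    """
    SELECT list_code,
           SUM(e_qualified) AS dist_e_qualified,
           SUM(f_qualified) AS dist_f_qualified
    FROM (
        -- Step 1: one row per (list_code, donor), with a 0/1 flag per category.
        SELECT list_code, donor,
               MAX(CASE WHEN language <> 'F' AND renew = '0' AND type = 'IN' THEN 1 ELSE 0 END) AS e_qualified,
               MAX(CASE WHEN language =  'F' AND renew = '0' AND type = 'IN' THEN 1 ELSE 0 END) AS f_qualified
        FROM gifts
        GROUP BY list_code, donor
    ) AS grouped_by_donor
    -- Step 2: summing the flags gives distinct donors per category.
    GROUP BY list_code
    ORDER BY list_code
    """
).fetchall()
print(distinct_counts)
```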

You can try using either another derived table or two to do the calculations you need, or using projections (queries in the field list). Without seeing the schema, it's hard to know which one will work for you.
