SQLite: get counts of all distinct values across a row - sql

For a personal end-of-year project, I've scraped my attendance off the school website, hoping to do some form of visualization of the data. I'm now stuck transforming that data into the form I need.
Currently my database looks like this:
Date,One,Two,Three,Four,Five,Six,Seven,Eight,Nine,Dee
2014-09-03,P,P,P,P,AU,AU,P,T*,AU,P
2014-09-04,P,P,P,P,N/A,AU,P,T*,N/A,P
2014-09-05,P,P,P,P,AU,AU,P,P,P,P
2014-09-09,P,P,P,P,AU,AU,P,P,AU,P
2014-09-11,AU,AU,P,AU,AU,P,AU,AU,AU,P
2014-09-15,P,P,P,P,AU,P,P,P,AU,P
2014-09-17,P,P,P,P,AU,AU,P,P,AU,P
Each column is a period, and each one holds an indicator of my presence. My question is: is it possible to turn that into something like this using only SQLite?
Date,P,AU,T*,N/A
2014-09-03,6,3,1,0
2014-09-04,6,1,1,2
2014-09-05,8,2,0,0
2014-09-09,7,3,0,0
2014-09-11,3,7,0,0
2014-09-15,8,2,0,0
2014-09-17,7,3,0,0
2014-09-19,9,1,0,0
That is, counting each occurrence of a value across the row.

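Something like this (SQLite's = comparison on text is case-sensitive by default, so the literals must match the stored values exactly):
select date,
case when one = 'P' then 1 else 0 end +
case when two = 'P' then 1 else 0 end +
...
case when dee = 'P' then 1 else 0 end as p,
case when one = 'AU' then 1 else 0 end +
case when two = 'AU' then 1 else 0 end +
...
case when dee = 'AU' then 1 else 0 end as au,
...
from attendance -- hypothetical table name; use your own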
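The pattern above can be tried end to end with Python's built-in sqlite3 module. This is only a sketch under assumptions: the table name attendance is made up, only the first two rows of the question's data are loaded, and the long SELECT is generated from the column list rather than typed out by hand.

```python
import sqlite3

# Column names and status codes taken from the question's sample data.
PERIODS = ["One", "Two", "Three", "Four", "Five",
           "Six", "Seven", "Eight", "Nine", "Dee"]
STATUSES = ["P", "AU", "T*", "N/A"]


def build_query():
    """Build one summed CASE expression per status, bound via parameters."""
    select_parts = []
    params = []
    for status in STATUSES:
        terms = " + ".join(
            f'CASE WHEN "{col}" = ? THEN 1 ELSE 0 END' for col in PERIODS
        )
        select_parts.append(f'{terms} AS "{status}"')
        params.extend([status] * len(PERIODS))
    sql = ('SELECT "Date", ' + ", ".join(select_parts)
           + ' FROM attendance ORDER BY "Date"')
    return sql, params


conn = sqlite3.connect(":memory:")
column_defs = ", ".join(f'"{col}" TEXT' for col in PERIODS)
# "attendance" is a hypothetical table name.
conn.execute(f'CREATE TABLE attendance ("Date" TEXT, {column_defs})')
conn.executemany(
    "INSERT INTO attendance VALUES (?,?,?,?,?,?,?,?,?,?,?)",
    [
        ("2014-09-03", "P", "P", "P", "P", "AU", "AU", "P", "T*", "AU", "P"),
        ("2014-09-04", "P", "P", "P", "P", "N/A", "AU", "P", "T*", "N/A", "P"),
    ],
)

sql, params = build_query()
result = conn.execute(sql, params).fetchall()
for row in result:
    print(row)
```

Generating the expression keeps the query in step with the column list if periods or status codes are added later; the SQL that reaches SQLite is still the plain sum of CASE expressions.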

Related

Applying Logic to Grouped SQL Results

Looking for some Oracle SQL theoretical help on the best way to handle a grouped result set. I understand why it groups the way it does, but I'm trying to figure out if there's a way to
I have a table that lists the activity of some cost centers. It looks like this:
Company | Object | Sub | July | August
A | 1 | | 20 | 50
A | 1 | | 10 | 0
A | 1 | 10 | 0 | 20
B | 1 | | 0 | 0
I then need to flag whether or not there was activity in August. So I'm writing a CASE statement where if August = 0 THEN 'FALSE' ELSE 'TRUE'. Then I need to group all records by Company, Object, and Sub. The Cumulative column is a SUM of both July and August. However, my output looks like this:
Company | Object | Sub | SUM | ActivityFlag
A | 1 | | 70 | TRUE
A | 1 | | 10 | FALSE
A | 1 | 10 | 20 | TRUE
B | 1 | | 0 | FALSE
What I need is this:
Company | Object | Sub | SUM | ActivityFlag
A | 1 | | 80 | TRUE
A | 1 | 10 | 20 | TRUE
B | 1 | | 0 | FALSE
Obviously, this is a simplified example of a much larger issue, but I'm trying to think through this problem theoretically so I can apply similar logic to my actual issue.
Is there a good SQL method for adding the August amount for rows 1 and 2, and then selecting TRUE so that this appears on a single row? I hope this makes sense.
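Use aggregation. Keep the two flag literals in the same case: with the default binary comparison, MAX over 'True' and 'false' returns 'false' for any group that contains an inactive row, because lowercase letters sort after uppercase ones.
select company, object, sub, sum(july + august),
max(case when august <> 0 then 'TRUE' else 'FALSE' end)
from table_name group by company, object, sub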
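As a quick check of the grouped flag, here is a small sketch that runs the same query shape against the question's four sample rows. It uses SQLite through Python's sqlite3 module as a stand-in for Oracle, and the table name activity and the column aliases are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "activity" is a hypothetical table name; Sub is NULL where the question left it blank.
conn.execute(
    "CREATE TABLE activity ("
    "company TEXT, object INTEGER, sub INTEGER, july INTEGER, august INTEGER)"
)
conn.executemany(
    "INSERT INTO activity VALUES (?,?,?,?,?)",
    [
        ("A", 1, None, 20, 50),
        ("A", 1, None, 10, 0),
        ("A", 1, 10, 0, 20),
        ("B", 1, None, 0, 0),
    ],
)

query = """
SELECT company, object, sub,
       SUM(july + august) AS total,
       -- 'TRUE' sorts after 'FALSE', so MAX yields 'TRUE' if any row was active
       MAX(CASE WHEN august <> 0 THEN 'TRUE' ELSE 'FALSE' END) AS activity_flag
FROM activity
GROUP BY company, object, sub
ORDER BY company, object, sub
"""
grouped = conn.execute(query).fetchall()
for row in grouped:
    print(row)
```

The two rows for company A with no Sub collapse into one row with a total of 80 and a TRUE flag, which is the shape asked for in the question.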
If you are flagging your detail rows with the CASE expression, you can wrap the CASE in an aggregate, similar to:
MAX(CASE WHEN August = 0 THEN 0 ELSE 1 END)
Another way is to aggregate the flag upward from an inner query:
SELECT MAX(IsAugust) AS IsAugust FROM
(
...
CASE WHEN August = 0 THEN 0 ELSE 1 END AS IsAugust
...
) X
GROUP BY...

Troubleshooting Errors with Two SUMs

I have a table, it's going to be used for a supplier scorecard, with eleven different fields that can be assigned a value of 1-5. Null values are allowed.
I need to write a query that will calculate the average of the fields that are filled out by each row. In other words, I might be dividing TOTAL by 11 in one row, and dividing TOTAL by 5 in another.
I'm working with this query:
select
cf$_vendor_no,
cf$_party,
cf$_environmental,
cf$_inspections,
cf$_invoice_process,
cf$_ncr,
cf$_on_time_delivery,
cf$_qms,
cf$_safety,
cf$_schedule,
cf$_scope_of_work,
cf$_turn_times,
sum(nvl(cf$_environmental,0)
+nvl(cf$_inspections,0)
+nvl(cf$_invoice_process,0)
+nvl(cf$_ncr,0)
+nvl(cf$_on_time_delivery,0)
+nvl(cf$_qms,0)
+nvl(cf$_safety,0)
+nvl(cf$_schedule,0)
+nvl(cf$_scope_of_work,0)
+nvl(cf$_turn_times,0))
/
sum(
case when cf$_environmental is not null then 1 else 0 end +
case when cf$_inspections is not null then 1 else 0 end +
case when cf$_invoice_process is not null then 1 else 0 end +
case when cf$_ncr is not null then 1 else 0 end +
case when cf$_on_time_delivery is not null then 1 else 0 end +
case when cf$_qms is not null then 1 else 0 end +
case when cf$_safety is not null then 1 else 0 end +
case when cf$_schedule is not null then 1 else 0 end +
case when cf$_scope_of_work is not null then 1 else 0 end +
case when cf$_turn_times is not null then 1 else 0 end) --as "average"
from supplier_scorecard_clv
group by cf$_vendor_no, cf$_party, cf$_environmental, cf$_inspections, cf$_invoice_process, cf$_ncr, cf$_on_time_delivery, cf$_qms, cf$_safety, cf$_schedule, cf$_scope_of_work, cf$_turn_times
And, it almost works.
The first SUM in my code will add the values in each row to give me a total. I get a total 25 for the first FARW002 row, I get 6 for the second, and 12 for the third.
The second SUM in my code works as well. I get a count of 6 for my first FARW002 row, 2 for my second, and 3 for my third.
However, when I try to combine these, like in the code snippet above, I get a "ORA-00923: FROM keyword not found where expected" error and I'm not sure why.
So, this is stupid but here's what the problem ended up being:
+nvl(cf$_turn_times,0))
/
sum(
When I changed the code to this (really I was just experimenting), it worked:
+nvl(cf$_turn_times,0))/sum(
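So having the / on a line of its own, which I only did to make the code more readable for me, was causing the issue. The most likely reason is that SQL*Plus, and clients that emulate it, treat a line containing only / as the command to execute the statement buffer, so the statement was cut off before it reached FROM, which matches the ORA-00923 message.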
Thanks for nothing Juan!
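For reference, the per-row average itself does not need SUM or GROUP BY, since every operand comes from the same row. The sketch below is a trimmed-down illustration, not the original query: it uses SQLite via Python's sqlite3 module instead of Oracle, COALESCE in place of NVL, a hypothetical scorecard table with only three score columns, and NULLIF so that a row with no scores yields NULL rather than a division by zero.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, shortened version of the scorecard table.
conn.execute(
    "CREATE TABLE scorecard ("
    "vendor_no TEXT, qms INTEGER, safety INTEGER, schedule INTEGER)"
)
conn.executemany(
    "INSERT INTO scorecard VALUES (?,?,?,?)",
    [
        ("V1", 5, None, 4),        # two scores filled in
        ("V2", 3, 3, 3),           # all three filled in
        ("V3", None, None, None),  # nothing filled in
    ],
)

query = """
SELECT vendor_no,
       -- multiply by 1.0 to avoid integer division in SQLite
       (COALESCE(qms, 0) + COALESCE(safety, 0) + COALESCE(schedule, 0)) * 1.0
       / NULLIF(
           CASE WHEN qms IS NOT NULL THEN 1 ELSE 0 END +
           CASE WHEN safety IS NOT NULL THEN 1 ELSE 0 END +
           CASE WHEN schedule IS NOT NULL THEN 1 ELSE 0 END,
           0) AS average
FROM scorecard
ORDER BY vendor_no
"""
averages = conn.execute(query).fetchall()
for row in averages:
    print(row)
```

Note that the division operator sits on the same line as the expression that follows it, so no line consists of a lone slash.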

Counting how many values exist in a row [SQL]

I'm not sure whether this has already been asked. It is probably easy, but I just can't see the way out of this problem.
It is about counting how many times we have sampled a material.
SELECT
TABLE01.MATERIAL_NO,
TABLE01.Sample_Tempt1,
TABLE01.Sample_Tempt2,
TABLE01.Sample_Tempt3,
TABLE01.Sample_Tempt4,
TABLE01.Sample_Tempt5
FROM
TABLE01
Is it possible to add another column that shows how many of the Sample_Tempt columns have a value?
I mean, if Tempt1 and Tempt2 have data, the column shows 2; if Tempt2, Tempt4 and Tempt5 have data, the column shows 3; and so on.
Thank you for helping me ^^
Sample :
Material no | Sample_Tempt1 | Sample_Tempt2 | Sample_Tempt3 | Sample_Tempt4 | Sample_Tempt5 |
PO1025 120 150 102
PO1026 122
For PO1025, I want the new column to show 3, because only three samples exist; for PO1026, I want it to show 1, since only one sample exists. Quite simple, right?
If by "exist" you mean "value is not NULL", then you can count the number of non-NULL values in each row as:
SELECT t1.MATERIAL_NO,
t1.Sample_Tempt1, t1.Sample_Tempt2, t1.Sample_Tempt3, t1.Sample_Tempt4, t1.Sample_Tempt5,
-- each CASE contributes 1 when the sample is present, 0 when it is NULL
((case when t1.Sample_Tempt1 is not null then 1 else 0 end) +
(case when t1.Sample_Tempt2 is not null then 1 else 0 end) +
(case when t1.Sample_Tempt3 is not null then 1 else 0 end) +
(case when t1.Sample_Tempt4 is not null then 1 else 0 end) +
(case when t1.Sample_Tempt5 is not null then 1 else 0 end)
) as NumTempts
FROM TABLE01 t1;
Note that I introduced a table alias. This makes the query easier to write and to read.
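A small runnable sketch of the same counting expression, using SQLite through Python's sqlite3 module. The question does not say which of the five columns hold the values, so the placement of the sample values below is an assumption; only the counts (3 and 1) come from the question.

```python
import sqlite3

SAMPLE_COLUMNS = [f"Sample_Tempt{i}" for i in range(1, 6)]

conn = sqlite3.connect(":memory:")
column_defs = ", ".join(f"{col} INTEGER" for col in SAMPLE_COLUMNS)
conn.execute(f"CREATE TABLE TABLE01 (MATERIAL_NO TEXT, {column_defs})")
conn.executemany(
    "INSERT INTO TABLE01 VALUES (?,?,?,?,?,?)",
    [
        # Which columns are filled is assumed for illustration.
        ("PO1025", 120, 150, None, 102, None),
        ("PO1026", None, 122, None, None, None),
    ],
)

# One CASE per sample column: 1 when a value is present, 0 when it is NULL.
count_expr = " + ".join(
    f"(CASE WHEN {col} IS NOT NULL THEN 1 ELSE 0 END)" for col in SAMPLE_COLUMNS
)
counts = conn.execute(
    f"SELECT MATERIAL_NO, {count_expr} AS NumTempts "
    "FROM TABLE01 ORDER BY MATERIAL_NO"
).fetchall()
for row in counts:
    print(row)
```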

CASE: compare to a number and return a name

The problem is that I have a column called cate containing the numbers 1 to 5, but I want descriptive names in the output.
For example, if the column has the number 1 I want "STONE" in the result set; if it is 2 I want "TREE".
It should look something like:
Select
case when t.cate = 1 then t.cate="STONE"
case when t.cate = 2 then t.cate="TREE"
else null end as test from dbt.tbl t
I do not want to change the value in the table, only in the output.
Any idea how I can get that to work?
Thanks for all your help in advance
Remove the extra CASE and return the literal directly instead of assigning it:
SELECT CASE WHEN t.cate = 1 THEN 'STONE'
WHEN t.cate = 2 THEN 'TREE'
ELSE null
END AS test
FROM dbt.tbl t
Alternatively, you can write
SELECT
CASE t.cate
WHEN 1 THEN 'STONE'
WHEN 2 THEN 'TREE'
ELSE NULL
END AS test
FROM dbt.tbl t
If the list is likely to change in the future (either through edits or additions), I'd do it as a separate table:
INSERT INTO Cates (Cate,Description) VALUES
(1,'Stone'),
(2,'Tree') --Etc
And then just do:
SELECT c.Description as Test
FROM dbt.tbl t inner join Cates c on t.Cate = c.Cate
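To compare the two approaches side by side, here is a minimal sketch in SQLite via Python's sqlite3 module. The dbt schema prefix is dropped, the three sample rows are invented, and the lookup values are written in upper case to match the CASE version; all of that is assumed for the example. It also shows one difference worth knowing: an inner join drops rows whose cate has no entry in the lookup table, while a LEFT JOIN returns NULL for them, like the ELSE NULL branch of the CASE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (cate INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?)", [(1,), (2,), (3,)])

# Lookup table as suggested in the answer; 3 deliberately has no entry.
conn.execute("CREATE TABLE Cates (Cate INTEGER PRIMARY KEY, Description TEXT)")
conn.executemany(
    "INSERT INTO Cates (Cate, Description) VALUES (?, ?)",
    [(1, "STONE"), (2, "TREE")],
)

case_rows = conn.execute(
    "SELECT CASE t.cate WHEN 1 THEN 'STONE' WHEN 2 THEN 'TREE' ELSE NULL END AS test "
    "FROM tbl t ORDER BY t.cate"
).fetchall()

inner_join_rows = conn.execute(
    "SELECT c.Description AS test "
    "FROM tbl t INNER JOIN Cates c ON t.cate = c.Cate ORDER BY t.cate"
).fetchall()

left_join_rows = conn.execute(
    "SELECT c.Description AS test "
    "FROM tbl t LEFT JOIN Cates c ON t.cate = c.Cate ORDER BY t.cate"
).fetchall()

print(case_rows)
print(inner_join_rows)
print(left_join_rows)
```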

mysql: Average over multiple columns in one row, ignoring nulls

I have a large table (of sites) with several numeric columns - say a through f. (These are site rankings from different organizations, like alexa, google, quantcast, etc. Each has a different range and format; they're straight dumps from the outside DBs.)
For many of the records, one or more of these columns is null, because the outside DB doesn't have data for it. They all cover different subsets of my DB.
I want column t to be their weighted average (each of a..f have static weights which I assign), ignoring null values (which can occur in any of them), except being null if they're all null.
I would prefer to do this with a simple SQL calculation, rather than doing it in app code or using some huge ugly nested if block to handle every permutation of nulls. (Given that I have an increasing number of columns to average over as I add in more outside DB sources, this would be exponentially more ugly and bug-prone.)
I'd use AVG, but that aggregates across rows in a GROUP BY, and this is within one record. The data is semantically nullable, and I don't want to average in some "average" value in place of the nulls; I want to count only the columns for which data is there.
Is there a good way to do this?
Ideally, what I want is something like UPDATE sites SET t = AVG(a*@a_weight,b*@b_weight,...) where any null values are just ignored and no grouping is happening.
EDIT: What I ended up using, based on van's answer and adding in correct weighted averages (assuming that a has already been normalized as needed, in this case to a float 0-1, where 1 = better):
UPDATE sites
SET t = (@a_weight * IFNULL(a, 0) + ...) / (IF(a IS NULL, 0, @a_weight) + ...)
WHERE (IF(a IS NULL, 0, 1) + ...) > 0
UPDATE sites
-- TODO: you might need to round it depending on your type
SET t =(COALESCE(a, 0) +
COALESCE(b, 0) +
COALESCE(c, 0) +
COALESCE(d, 0) +
COALESCE(e, 0) +
COALESCE(f, 0)
) /
((CASE WHEN a IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN b IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN c IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN d IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN e IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN f IS NULL THEN 0 ELSE 1 END)
)
WHERE 0<>((CASE WHEN a IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN b IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN c IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN d IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN e IS NULL THEN 0 ELSE 1 END) +
(CASE WHEN f IS NULL THEN 0 ELSE 1 END)
)
You could also use COALESCE in the other parts, but that would not properly handle a rating whose value is 0, because it would be excluded from the count. The WHERE clause avoids division by zero, but you might need an additional UPDATE statement to handle entries that have no rating at all.
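The weighted variant can be sketched the same way. This example runs on SQLite through Python's sqlite3 module rather than MySQL, uses only three columns, and the weights and sample rows are made up. It also swaps the WHERE clause for NULLIF on the divisor, so rows where every column is NULL get t = NULL in the same statement instead of being skipped.

```python
import sqlite3

# Hypothetical static weights per source column.
WEIGHTS = {"a": 0.5, "b": 0.25, "c": 0.25}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sites (id INTEGER PRIMARY KEY, a REAL, b REAL, c REAL, t REAL)"
)
conn.executemany(
    "INSERT INTO sites (id, a, b, c) VALUES (?,?,?,?)",
    [
        (1, 1.0, 0.0, None),    # a rating of 0 still counts toward the divisor
        (2, None, None, None),  # no data at all
        (3, 0.4, 0.8, 0.6),
    ],
)

# Weighted sum of the values that are present...
numerator = " + ".join(
    f"{w} * COALESCE({col}, 0)" for col, w in WEIGHTS.items()
)
# ...divided by the sum of the weights of the columns that are present.
denominator = " + ".join(
    f"CASE WHEN {col} IS NULL THEN 0 ELSE {w} END" for col, w in WEIGHTS.items()
)
conn.execute(
    f"UPDATE sites SET t = ({numerator}) / NULLIF({denominator}, 0)"
)

weighted = conn.execute("SELECT id, t FROM sites ORDER BY id").fetchall()
for row in weighted:
    print(row)
```

Because presence is tested with IS NULL rather than by looking at the value, a legitimate rating of 0 keeps its weight in the divisor, which is the concern raised above.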