I have a list of multiple text criteria that I need to use to count occurrences of matching text in the database with the LIKE operator.
How can I write a query that will run the comparison with all criteria at once and return a summarized list of counts per each of these criteria?
I started with a simple query that I'd like to iterate over with a WHILE loop. Unfortunately, I don't have permission to create a table holding the criteria for each iteration to reference, so I have to prepare a separate query for each criterion.
SELECT sum(X), sum(Y), Z
FROM table
WHERE text LIKE 'criteria sample'
WHERE A = nnnnnn
AND B > 0
Group by Z
To avoid writing and running multiple queries, how can I load all the criteria into a single query and run them against the content to get the results faster?
You can construct a table in the query with your values:
SELECT v.comparison, sum(t.X), sum(t.Y), t.Z
FROM table t JOIN
(VALUES ('criteria sample'),
('criteria sample 2'),
. . . -- whatever the values are
) v(comparison)
ON t.text LIKE v.comparison
WHERE t.A = nnnnnn AND t.B > 0
GROUP BY v.comparison, t.Z;
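A minimal sketch of the inline-values join, run against SQLite from Python only so it can be tried without any table-creation rights on the real database. The table `items`, its columns and the sample rows are made up for illustration. Note that the patterns in the list carry their own % wildcards, and that SQLite wants the VALUES list wrapped in a CTE rather than aliased directly as `v(comparison)`.

```python
import sqlite3

# Hypothetical sample data standing in for the asker's table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (txt TEXT, x INTEGER, y INTEGER, z TEXT, a INTEGER, b INTEGER);
    INSERT INTO items VALUES
        ('foo one',  1, 10, 'k1', 7, 1),
        ('foo two',  2, 20, 'k1', 7, 1),
        ('bar one',  3, 30, 'k2', 7, 1),
        ('foo bar',  4, 40, 'k2', 7, 1),
        ('foo skip', 5, 50, 'k1', 8, 1);
""")

# The criteria live in an inline VALUES list, so no permanent table is needed.
query = """
    WITH v(comparison) AS (VALUES ('%foo%'), ('%bar%'))
    SELECT v.comparison, t.z, SUM(t.x) AS sum_x, SUM(t.y) AS sum_y
    FROM items AS t
    JOIN v ON t.txt LIKE v.comparison
    WHERE t.a = 7 AND t.b > 0
    GROUP BY v.comparison, t.z
    ORDER BY v.comparison, t.z
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

A row that matches several patterns (here 'foo bar') is counted once under each of them, which is usually what a per-criterion summary needs.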
If you wanted every matching row, you could simply say WHERE text LIKE '%foo%' OR text LIKE '%bar%'. However, if I understand your question, you actually want a count of rows matching '%foo%' and a separate count of rows matching '%bar%'. This implies that the SUMs and GROUP BY have to be run separately for each criteria text.
One possible solution is to write all the queries separately, then union them at the end. Note that union requires every column to have a name. You'll also want a static identifier string in each subquery so you can tell which criteria was met. So you'll have something like:
SELECT sum(X) AS sumx, sum(Y) AS sumy, Z, 'foo' AS criteria FROM table WHERE text LIKE '%foo%' AND A = nnnnnn AND B > 0 Group by Z
UNION
SELECT sum(X) AS sumx, sum(Y) AS sumy, Z, 'bar' AS criteria FROM table WHERE text LIKE '%bar%' AND A = nnnnnn AND B > 0 Group by Z
UNION
...(etc)...
I think it should just work like this, if I'm understanding correctly.
select t.z,
sum(case when text like '%firstcriteria%' then 1 else 0 end) as countfirstcriteria,
sum(case when text like '%secondcriteria%' then 1 else 0 end) as countsecondcriteria
from table t
where A = 'nnnnnn'
AND B > 0
group by t.z
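If the list of criteria is long, the CASE columns can be generated instead of typed by hand. This is only a sketch under assumptions: it uses SQLite from Python, and the table `items`, its columns and the criteria list are invented for the example. The search terms are passed as bound parameters rather than pasted into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items (txt TEXT, z TEXT);
    INSERT INTO items VALUES
        ('foo one', 'k1'), ('foo two', 'k1'), ('bar one', 'k2'), ('foo bar', 'k2');
""")

criteria = ["foo", "bar"]  # hypothetical search terms

# One conditional SUM per criterion; each "?" is bound to a '%term%' pattern.
cases = ", ".join(
    "SUM(CASE WHEN txt LIKE ? THEN 1 ELSE 0 END)" for _ in criteria
)
sql = f"SELECT z, {cases} FROM items GROUP BY z ORDER BY z"
params = [f"%{term}%" for term in criteria]

rows = conn.execute(sql, params).fetchall()
for row in rows:
    print(row)
```

This gives one row per z with one count column per criterion, and the table is scanned only once.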
LIKE patterns should be wrapped in % wildcards so that the text you are looking for matches even if there are words before or after it. I also removed the redundant WHERE.
SELECT sum(X), sum(Y), Z
FROM table
WHERE text LIKE '%criteria sample%'
AND A = 'nnnnnn'
AND B > 0
Group by Z
I have a query that's being used in my application. I had to make a small edit to it (selecting an additional column), and now I no longer get the same results, which produces a bad file. As an example, this is what the query looks like:
Select
'X' = tblA.VendorNumber,
'Y' = tblB.Label,
'Z' = tblC.InvoiceNo,
'W' = tblD.Checks
From -- Doing some joins here
Group By
tblA.VendorNumber, tblB.label, tblc.InvoiceNo, tblD.Checks
The result set gives me many records, but groups by the ones with identical X,Y,Z,W - so with no Group By it would look like this
X Y Z W
-----------------------------------------------------------
123 Anton 772 0
123 Anton 772 0
Obviously, with the group by they are rolled up into one...
The issue comes when I try to include an additional column in my SELECT. I need this column in the query because my code needs its value to distinguish what type of record it is. With the new column, these two rows of data are no longer the same, so they do not get rolled up.
Is there a way for me to somehow add an additional column, but not display it, and exclude it from the Group By?
This is what I mean
Select
'X' = tblA.VendorNumber,
'Y' = tblB.Label,
'Z' = tblC.InvoiceNo,
'W' = tblD.Checks,
'P' = tblC.Proc -- New column
From -- Doing some joins here
Group By
tblA.VendorNumber, tblB.label, tblc.InvoiceNo, tblD.Checks,
tblC.Proc -- New column
In this case the data looks like this
X Y Z W P
---------------------------------------------------------------------
123 Anton 772 0 FPN
123 Anton 772 0 PPN
So now that P is different for the 2 records that were previously rolled up into one, is there a way for me to not display P but still be able to get its value from my record set? I am unable to read 'P' if it's not selected in this one query, and because the two records are not rolling up, I'm having some major issues.
Basically I need to select 'P' but not include it in my result set or group by.
Any help would be much appreciated.
There's a couple of options, I guess... There's not really a way to get values without having them in your results. How else would you read them?
One would be to STUFF the values of 'P' into a comma delimited list ie:
X Y Z W P
123 Anton 772 0 FPN,PPN
Then read them in your application separated by comma. Read here: https://stackoverflow.com/questions/31211506/how-stuff-and-for-xml-path-work-in-sql-server
Another would be to create boolean headers if there's not too many options for 'P' or you know all of the options. You can create them using CASE statements like:
,SUM(CASE WHEN tblC.Proc = 'PPN' THEN 1 ELSE 0 END) AS "PPN"
EDIT: I left out the aggregate around it, which is needed so it groups correctly. You can also use MAX to get 1 or 0, depending on how many results there can be for tblC.Proc. Any aggregate function around your CASE expressions will combine your rows into one.
For results like:
X Y Z W FPN PPN AnotherP
123 Anton 772 0 1 1 0
Third, if I misread your question and you don't need the values but just need them in a WHERE, then don't display them.
There's definitely more ways to go about this.
Does this help?
You need to do some sort of aggregate function on P so you don't need to group on it. If you only care about one of the values, you could use MIN() or MAX() to get one of them.
So, something like:
SELECT X, Y, Z, MAX(P) AS P etc....
Another way would be to use something like FOR XML PATH to pivot the values into a single value (maybe a comma separated list). There are lots of examples online of people doing this. Here is one example:
https://sqlperformance.com/2014/08/t-sql-queries/sql-server-grouped-concatenation
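To make the aggregate options concrete, here is a small sketch run on SQLite from Python. It is not SQL Server code: SQLite's `group_concat` is swapped in for the STUFF / FOR XML PATH technique, and the table name `invoices` is made up. It shows MAX(P), a comma-separated list of P values, and one flag column per known P value, all without P in the GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices (x INTEGER, y TEXT, z INTEGER, w INTEGER, p TEXT);
    INSERT INTO invoices VALUES
        (123, 'Anton', 772, 0, 'FPN'),
        (123, 'Anton', 772, 0, 'PPN');
""")

# P is only ever used inside aggregates, so the two rows still roll up into one.
row = conn.execute("""
    SELECT x, y, z, w,
           MAX(p)                                      AS max_p,
           group_concat(p)                             AS all_p,
           SUM(CASE WHEN p = 'FPN' THEN 1 ELSE 0 END)  AS fpn,
           SUM(CASE WHEN p = 'PPN' THEN 1 ELSE 0 END)  AS ppn
    FROM invoices
    GROUP BY x, y, z, w
""").fetchone()

# The order of values inside group_concat is not guaranteed, so sort after splitting.
p_values = sorted(row[5].split(","))
print(row[:5], p_values, row[6:])
```

The application can then read whichever of these columns suits it and ignore the rest.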
If you only want one record per distinct combination of the first four columns, which one of the P values do you want to show?
If you have an answer for this question, just add to the query
Where tblC.Proc = -->YOUR_EXPECTED_VALUE_OR_CONDITION_HERE<--
I am quite new to SQL and I am currently working on some survey results with PostgreSQL. I need to calculate the percentage of each option on a 5-point scale for all survey questions. I have a table with respondentid, questionid and the question response value. Demographic info needed for filtering a datacut is retrieved from another table. The query result is then inserted into a result table. The query texts for specific datacuts are all generated by a VBA script.
It works OK in general, but there's one problematic case: when there are no respondents for a specific cut, I receive an empty table as the query result. If the respondent count is greater than 0 but lower than the calculation threshold (5 respondents), I get a table full of NULLs, which is OK. For 0 respondents I get 0 rows, nothing is passed to the result table, and that causes some displacement in the final table. I am able to track such cuts because I also calculate the respondent count for the overall datacut and store it in another table. But is there anything I can do at this point to generate a table full of NULLs that could be inserted into the result table when needed?
Thanks in advance and sorry for clumsiness in code.
WITH ItemScores AS (
SELECT
rsp.questionid,
CASE WHEN SUM(CASE WHEN rsp.respvalue >= 0 THEN 1 ELSE 0 END) < 5 THEN
NULL
ELSE
ROUND(SUM(CASE WHEN rsp.respvalue = 5 THEN 1 ELSE 0 END)/CAST(SUM(CASE
WHEN rsp.respvalue >= 0 THEN 1 ELSE 0 END) AS DECIMAL),2)
END AS "5spercentage",
... and so on for frequencies of 1s,2s,3s and 4s
SUM(CASE WHEN rsp.respvalue >= 0 THEN 1 ELSE 0 END) AS QuestionTotalAnswers
FROM (
some filtering applied here [...]
) AS rsp
GROUP BY rsp.questionid
ORDER BY rsp.questionid
)
INSERT INTO results_items SELECT * FROM ItemScores;
If you want to make sure every questionid appears in the result, build the plain list of question ids first (for example in a CTE) and then LEFT JOIN it to the query that does the aggregations and calculations. The list is always generated, and the calculated values are joined onto it wherever they exist.
The example of its concept would be something like:
with calcs as (
select questionid, sum(respvalue) as sum_per_question
from rsp
group by questionid)
select distinct rsp.questionid, calcs.sum_per_question
from rsp
left join calcs on rsp.questionid = calcs.questionid
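A runnable sketch of that idea, using SQLite from Python instead of PostgreSQL. It assumes the full list of question ids can be read from somewhere that is not affected by the datacut filter; the `questions` table, the `region` filter column and the sample rows are all invented for the example. With a cut that matches no respondents, every question still comes back, with NULL (None in Python) in the calculated column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (questionid INTEGER PRIMARY KEY);
    CREATE TABLE responses (respondentid INTEGER, questionid INTEGER,
                            respvalue INTEGER, region TEXT);
    INSERT INTO questions VALUES (1), (2);
    INSERT INTO responses VALUES
        (10, 1, 5, 'north'), (11, 1, 4, 'north'), (10, 2, 3, 'north');
""")

def counts_for(region):
    """Return one row per question, even when the cut has no respondents."""
    sql = """
        WITH calcs AS (
            SELECT questionid, COUNT(*) AS total_answers
            FROM responses
            WHERE region = ?          -- the datacut filter
            GROUP BY questionid
        )
        SELECT q.questionid, c.total_answers
        FROM questions AS q            -- unfiltered list of questions
        LEFT JOIN calcs AS c ON c.questionid = q.questionid
        ORDER BY q.questionid
    """
    return conn.execute(sql, (region,)).fetchall()

print(counts_for("north"))
print(counts_for("south"))  # no respondents: rows are kept, values are None
```

The important part is that the left side of the join is not filtered by the cut; if it were, it would be empty too and nothing would be returned.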
I can do this in SQL Server:
SELECT 'HERRAMIENTA ELÉCTRICA' AS TIPO_PRODUCTO,
0 AS DEPRECIACION,
(select sum(empid) from HR.employees) STOCK
but in Access the same query gives me the following error:
Query input must contain at least one table or query
So what would be the best way to emulate this? Querying some unrelated table just to satisfy the syntax looks dirty to me.
EDIT 1: HR.employees may have no data, but I still want to show the constants ('HERRAMIENTA ELÉCTRICA', 0) and 0 in the third column, maybe using ISNULL; that is not the problem here.
Why not select directly:
select 'HERRAMIENTA ELÉCTRICA' AS TIPO_PRODUCTO,
0 AS DEPRECIACION,
IIF(ISNULL(sum(empid)), 0, sum(empid)) AS STOCK
from HR.employees
This simply doesn't work in Access. You need a FROM clause.
So you need to have a dummy table with one record, even if you don't use a single field from that table.
SELECT 'HERRAMIENTA ELÉCTRICA' AS TIPO_PRODUCTO,
0 AS DEPRECIACION,
(select sum(empid) from HR.employees) STOCK
FROM Dummy_Table
Using this example as empty table:
with employ as
(select 2 as col from dual
minus
select 2 as col from dual)
The query is this one:
select 'HERRAM' as tipo,
0 as deprec,
coalesce(sum(col), 0) as STOCK
from employ;
coalesce(x, value) returns value when x is NULL, so the column shows value instead of NULL.
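The same behaviour can be checked quickly on SQLite from Python (used here only as a convenient stand-in; the `employees` table is a made-up empty table). An aggregate without GROUP BY returns exactly one row even when the table is empty, and COALESCE turns the NULL sum into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (empid INTEGER)")  # deliberately left empty

# SUM over zero rows is NULL; COALESCE replaces it with 0.
row = conn.execute("""
    SELECT 'HERRAM' AS tipo,
           0 AS deprec,
           COALESCE(SUM(empid), 0) AS stock
    FROM employees
""").fetchone()
print(row)  # ('HERRAM', 0, 0)
```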
In Access, you can use a system table, and Val and Nz for the zero value:
SELECT TOP 1
'HERRAMIENTA ELÉCTRICA' AS TIPO_PRODUCTO,
0 AS DEPRECIACION,
Val(Nz((select sum(empid) from HR.employees), 0)) AS STOCK
FROM
MSysObjects
I have two tables in Teradata, Table_A and Table_B, joined with a LEFT JOIN. I then run a SELECT statement that contains attributes from both tables:
SELECT
attribute_1,
attribute_2,
...
attribute_N
Afterwards, I use SUM functions to do certain calculations. These functions look something like this:
SUM (
CASE WHEN Attribute_1 > 2 THEN attribute_2*1.2
ELSE 0
End
)
(in this example attributes in the select part are used).
But in the CASE part I also use attributes that are not in the SELECT list, something like this:
SUM (
CASE WHEN Attribute_X > 2 THEN attribute_Y*1.2
ELSE 0
End
)
Of course at the end I am doing GROUP BY 1,2,...,N
The error I am getting is "Selected non-aggregate values must be part of the associated group."
Furthermore, I have checked many times over that the number of attributes in the SELECT part really is N.
The question is: why am I getting this error? Is it because the CASE expressions inside SUM use attributes (attribute_X and attribute_Y) that are not included in the SELECT part?
A blueprint of the final statement looks something like this:
INSERT INTO table_new
SELECT
attribute_1,
attribute_2,
...
attribute_N,
SUM (
CASE WHEN Attribute_1 > 2 THEN attribute_2*1.2
ELSE 0
End
) as sum_a,
SUM (
CASE WHEN Attribute_X > 2 THEN attribute_Y*1.2
ELSE 0
End
) as sum_X
FROM table_a LEFT JOIN table_B
ON ...
GROUP BY 1,2,...,N
The error message suggests that you have not included all the non-aggregate columns listed in your SELECT statement in your GROUP BY expression. I'm guessing that you have more columns listed than you have "place holders".
The best way to avoid this is to explicitly name all the columns and not use the "relative positioning" syntax. In other words, rather than using GROUP BY 1,2,...N use:
GROUP BY
attribute_1,
attribute_2,
...
attribute_N
If that does not fix your problem, modify your question and show a complete query that is not working.
I have found the error: the SUM part was composed of several sub-parts, for example "amount - SUM(...) + SUM(...)". I had to include the attributes from the "amount" part in the GROUP BY.
I want to write a query that returns 3 results followed by blank results followed by the next 3 results, and so on. So if my database had this data:
CREATE TABLE tbl (a integer, b integer, c integer, d integer);
INSERT INTO tbl (a,b,c,d)
VALUES (1,2,3,4),
(5,6,7,8),
(9,10,11,12),
(13,14,15,16),
(17,18,19,20),
(21,22,23,24),
(25,26,27,28);
I would want my query to return this
1,2,3,4
5,6,7,8
9,10,11,12
, , ,
13,14,15,16
17,18,19,20
21,22,23,24
, , ,
25,26,27,28
I need this to work for arbitrarily many selected rows, always grouped in threes like this.
I'm running PostgreSQL 8.3.
This should work flawlessly in PostgreSQL 8.3
SELECT a, b, c, d
FROM (
SELECT rn, 0 AS rk, (x[rn]).*
FROM (
SELECT x, generate_series(1, array_upper(x, 1)) AS rn
FROM (SELECT ARRAY(SELECT tbl FROM tbl) AS x) x
) y
UNION ALL
SELECT generate_series(3, (SELECT count(*) FROM tbl), 3), 1, (NULL::tbl).*
ORDER BY rn, rk
) z
Major points
Works for a query that selects all columns of tbl.
Works for any table.
For selecting arbitrary columns you have to substitute (NULL::tbl).* with a matching number of NULL columns in the second query.
Assuming that NULL values are ok for "blank" rows.
If not, you'll have to cast your columns to text in the first and substitute '' for NULL in the second SELECT.
Query will be slow with very big tables.
If I had to do it, I would write a plpgsql function that loops through the results and inserts the blank rows. But you mentioned you had no direct access to the db ...
In short, no, there's not an easy way to do this, and generally, you shouldn't try. The database is concerned with what your data actually is, not how it's going to be displayed. It's not an appropriate scope of responsibility to expect your database to return "dummy" or "extra" data so that some down-stream process produces a desired output. The generating script needs to do that.
As you can't change your down-stream process, you could (read that with a significant degree of skepticism and disdain) add things like this:
Select Top 3
a, b, c, d
From
table
Union Select Top 1
'', '', '', ''
From
table
Union Select Top 3 Skip 3
a, b, c, d
From
table
Please don't actually try to do that.
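For comparison, if the script that consumes the rows can be touched after all, the grouping is only a few lines there. This is just a sketch in Python with a hypothetical helper name (`with_blank_rows`); it yields a blank marker between groups of three and none at the start or end.

```python
from itertools import islice

def with_blank_rows(rows, group_size=3, blank=None):
    """Yield rows unchanged, inserting `blank` between groups of `group_size`."""
    it = iter(rows)
    first_group = True
    while True:
        chunk = list(islice(it, group_size))
        if not chunk:
            break
        if not first_group:
            yield blank  # separator goes before every group except the first
        first_group = False
        yield from chunk

# Rows as they would come back from "SELECT a, b, c, d FROM tbl".
data = [(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12), (13, 14, 15, 16),
        (17, 18, 19, 20), (21, 22, 23, 24), (25, 26, 27, 28)]
for row in with_blank_rows(data):
    print(" , , ," if row is None else ",".join(map(str, row)))
```

Because the separator is emitted before a group rather than after it, a result set whose size is an exact multiple of three does not end with a trailing blank row.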
You can do it (at least on DB2 - there doesn't appear to be equivalent functionality for your version of PostgreSQL).
No looping needed, although there is a bit of trickery involved...
Please note that though this works, it's really best to change your display code.
Statement requires CTEs (although that can be re-written to use other table references), and OLAP functions (I guess you could re-write it to count() previous rows in a subquery, but...).
WITH dataList (rowNum, dataColumn) as (SELECT CAST(CAST(:interval as REAL) /
(:interval - 1) * ROW_NUMBER() OVER(ORDER BY dataColumn) as INTEGER),
dataColumn
FROM dataTable),
blankIncluder(rowNum, dataColumn) as (SELECT rowNum, dataColumn
FROM dataList
UNION ALL
SELECT rowNum - 1, :blankDataColumn
FROM dataList
WHERE MOD(rowNum - 1, :interval) = 0
AND rowNum > :interval)
SELECT *
FROM blankIncluder
ORDER BY rowNum
This will generate a list of those elements from the datatable, with a 'blank' line every interval lines, as ordered by the initial query. The result set only has 'blank' lines between existing lines - there are no 'blank' lines on the ends.