I am trying to fetch the column value after first comma using sql query. I have written the query but It is not optimized.
I want the data in output after the first comma in Address field.
Data
ID Address
1 Add1,Add2,Add3
2 Add1,Add2
Expected Output
ID Address
1 Add2,Add3
2 Add2
Query
select right(Address, len(Address) - charindex(',', reverse(Address)))
Can anyone help to write the optimized query for above table and output.
If you want to just divide the address after first comma you can use the below query.
SELECT ID,SUBSTRING(Address , CHARINDEX(',' , Address) + 1 , LEN(Address)) FROM Data;
If you want to divide it based on the number of addresses then this query is more suitable
WITH DivideName
AS
(
SELECT ID,name,(CASE WHEN CHARINDEX(',',Address)>0 THEN SUBSTRING(Address,1,CHARINDEX(',',Address)-1) ELSE Address END) part, (CASE WHEN CHARINDEX(',',Address)!=0 THEN SUBSTRING(Address,CHARINDEX(',',Address)+1,LEN(Address)) ELSE Address END) Rempart, 1 countno
FROM author
UNION ALL
SELECT ID,Address,(CASE WHEN CHARINDEX(',',Rempart)>0 THEN SUBSTRING(Rempart,1,CHARINDEX(',',Rempart)-1) ELSE Rempart END) part, (CASE WHEN CHARINDEX(',',Rempart)!=0 THEN SUBSTRING(Rempart,CHARINDEX(',',Rempart)+1,LEN(Rempart)) ELSE Rempart END) Rempart, countno + 1 FROM DivideName WHERE countno<(LEN(Address)-LEN(REPLACE(Address,',',''))+1)
)
SELECT Address, part Word, countno WordNo FROM DivideName ORDER BY ID,countno
Related
I was playing with usa_names dataset on Bigquery and in order to be able to visualize the top 10 names between 1910 and 2020, I had to GROUP BY year and create a new column for each of the 10 names using CASE.
The thing is, I will like to visualize the top 100 and I want to know if there is a way to automate the CASE, in the sense that I don't have to write a "WHEN and THEN Clause for each name in order to create a column for them.
I had to use the following SQL query code to first get the top 10 names;
SELECT
name,
SUM(number) AS total
FROM
bigquery-public-data.usa_names.usa_1910_current
WHERE
year BETWEEN 1910 AND 2020
GROUP BY
name
ORDER BY
total DESC
LIMIT
10
And then use the following code to convert each name row to columns;
SELECT
year,
SUM(CASE WHEN name = 'James' THEN number ELSE 0 END) AS James,
SUM(CASE WHEN name = 'John' THEN number ELSE 0 END) AS John,
SUM(CASE WHEN name = 'Robert' THEN number ELSE 0 END) AS Robert,
SUM(CASE WHEN name = 'Michael' THEN number ELSE 0 END) AS Michael,
SUM(CASE WHEN name = 'William' THEN number ELSE 0 END) AS William,
SUM(CASE WHEN name = 'Mary' THEN number ELSE 0 END) AS Mary,
SUM(CASE WHEN name = 'Richard' THEN number ELSE 0 END) AS Richard,
SUM(CASE WHEN name = 'Joseph' THEN number ELSE 0 END) AS Joseph,
SUM(CASE WHEN name = 'Charles' THEN number ELSE 0 END) AS Charles,
SUM(CASE WHEN name = 'Thomas' THEN number ELSE 0 END) AS Thomas
FROM
bigquery-public-data.usa_names.usa_1910_current
GROUP BY
year
ORDER BY
year
I want to achieve the same result without having to first pull out the name and manually enter them into the CASE statements.
Also, this won't be needed if there is a way to visualize the data directly without having to convert the names from row to columns.
Thanks.
You need to combine 2 capabilities:
row to column: PIVOT clause
scripting to automate the query finding the top 10 names
declare top_names default ((
select concat("'", string_agg(name, "','"), "'")
from (
// your query in question
SELECT
name
FROM
bigquery-public-data.usa_names.usa_1910_current
WHERE
year BETWEEN 1910 AND 2020
GROUP BY
name
ORDER BY
SUM(number) DESC
LIMIT
10
)));
select top_names;
The output is:
'James','John','Robert','Michael','William','Mary','David','Richard','Joseph','Charles'
The PIVOT query you will need is:
SELECT * FROM
(select year, name, sum(number) number
from bigquery-public-data.usa_names.usa_1910_current
group by year, name
)
PIVOT(SUM(number) FOR name IN ('James','John','Robert','Michael','William','Mary','David','Richard','Joseph','Charles'
))
which output exactly as your second query.
To stick the 2 together, you will need something like:
execute immediate concat(
"""
SELECT * FROM
(select year, name, sum(number) number
from bigquery-public-data.usa_names.usa_1910_current
group by year, name
)
PIVOT(SUM(number) FOR name IN (
""",
top_names,
"))");
You shouldn't need to create a column for each name. Your first query is sufficient (would obviously just need to change the limit to 100). Based on the questions tags I'm assuming your using Tableau, so it would be as simple as choosing your desired visualisation (say a bar chart) and placing names on one axis and total on the other axis.
Based on your follow up comment it would look like this
SELECT
name,
year,
SUM(number) AS total
From bigquery-public-data.usa_names.usa_1910_current
WHERE name IN
(
SELECT name
FROM
(
SELECT
name,
SUM(number) AS total
FROM
bigquery-public-data.usa_names.usa_1910_current
WHERE
year BETWEEN 1910 AND 2020
GROUP BY
name
ORDER BY
total DESC
LIMIT
100
))
GROUP BY name, year
You could also look into using calculate fields within Tableau ok the raw data to achieve the desired visualisation.
I have below tables
Table1: "Demo"
Columns: SSN, sales, Create_DT,Update_Dt
Table2: "Agent"
Columns: SSN,sales, Agent_Name, Create_Dt, Update_DT
Scenario 1 and desired result set:
I want output as 0 if the count of SSN in Demo table is matched with the count of SSN in Agent table
if the count is not matched then I want result as 1
Scenario 2 and desired result set:
I want output as 0 if the sum of sales in Demo table is matched with the sum of sales in Agent table
if the sum is not matched then I want result as 1
Please help on this query part
Thanks
You can write two queries separately to take counts within the result query
SELECT (SELECT count(Demo.SSN) as SSN1 from Demo)!=(SELECT count(Agent.SSN) as SSN2 from Agent) AS Result;
Basically what the inner queries does is it checked whether the counts are equal or not and outputs 1 if it is true and 0 if it is false. Since you have asked to output 1 if it is false I used '!=' sign.
You can try the same procedure in scenario 2 also
For scenario 1
select (Case when (select count(ssn) from Demo)=(select count(ssn) from Agent) then 0 else 1 end) as desired_result
If you want to count unique ssn then:
select (Case when (select count(distinct ssn) from Demo)=(select count(distinct ssn) from Agent) then 0 else 1 end) as desired_result
For scenario 2:
select (Case when (select sum(sales) from Demo)=(select sum(sales) from Agent) then 0 else 1 end) as desired_result
I would suggest one query with both sets of information:
select (d.num_ssn <> a.num_ssn) as have_different_ssn_count,
(d.sales <> a.sales) as have_different_sales
from (select count(distinct ssn) as num_ssn,
coalesce(sum(sales), 0) as sales
from demo
) d cross join
(select count(distinct ssn) as num_ssn,
coalesce(sum(sales), 0) as sales
from agent
) a;
Note: This returns boolean values -- true/false rather than 1/0. If you really want 0/1, then use case:
select (case when d.num_ssn <> a.num_ssn then 1 else 0 end) as have_different_ssn_count,
(case when d.sales <> a.sales then 1 else 0 end) as have_different_sales
It would not surprise me if you were not only interested in the total counts but also that the agent/sales combinations are the same in both tables. If that is the case, please ask a new question with a clear explanation. Sample data and desired results help.
I have a list of words I need to find in a specific column , "description of what happenned "
this holds anything up to 500 or more characters. I have the script below that does work
However how do I replace the Name column 1.2.3 with the actual name of the word I am looking for with the total next to it.
Just cant get it to display prob something simple.
select GROUPING_ID ( Amoxicillin ,Atorvastatin ) as Name ,count(*) as Total
from ( select case when [description_of_what_happened] like '%Amoxicillin%'
then 1 else 0 end as Amoxicillin ,
case when [description_of_what_happened] like '%Atorvastatin%'
then 1 else 0 end as Atorvastatin
FROM "NAME OF TABLE"
group by grouping sets (() ,(Amoxicillin),(Atorvastatin))
having coalesce (Amoxicillin,1) != 0 and coalesce (Atorvastatin,1) != 0
order by grouping_id (Amoxicillin,Atorvastatin)
row 3 being the total I need row 1 and row 2 to show the name of the product
result as below
Name Total
1 7
2 9
3 4112
You can use strings instead of flags:
select coalesce(Amoxicillin, Atorvastatin, 'Total') as Name,
count(*) as Total
from (select (case when [description_of_what_happened] like '%Amoxicillin%'
then 'Amoxicillin'
end) as Amoxicillin ,
(case when [description_of_what_happened] like '%Atorvastatin%'
then 'Atorvastatin'
end
) as Atorvastatin
from "NAME OF TABLE"
where Amoxicillin is not null or Atorvastatin is not null
group by grouping sets ((), (Amoxicillin), (Atorvastatin))
order by name;
Note that I also moved the logic in the having to the where.
I have been building up a query today and I have got stuck. I have two unique Ids that identify if and order is Internal or Web. I have been able to split this out so it does the count of how many times they appear but unfortunately it is not providing me with the intended result. From research I have tried creating a Count Distinct Case When statement to provide me with the results.
Please see below where I have broken down what it is doing and how I expect it to be.
Original data looks like:
Company Name Order Date Order Items Orders Value REF
-------------------------------------------------------------------------------
CompanyA 03/01/2019 Item1 Order1 170 INT1
CompanyA 03/01/2019 Item2 Order1 0 INT1
CompanyA 03/01/2019 Item3 Order2 160 WEB2
CompanyA 03/01/2019 Item4 Order2 0 WEB2
How I expect it to be:
Company Name Order Date Order Items Orders Value WEB INT
-----------------------------------------------------------------------------------------
CompanyA 03/01/2019 4 2 330 1 1
What currently comes out
Company Name Order Date Order Items Orders Value WEB INT
-----------------------------------------------------------------------------------------
CompanyA 03/01/2019 4 2 330 2 2
As you can see from my current result it is counting every line even though it is the same reference. Now it is not a hard and fast rule that it is always doubled up. This is why I think I need a Count Distinct Case When. Below is my query I am currently using. This pull from a Progress V10 ODBC that I connect through Excel. Unfortunately I do not have SSMS and Microsoft Query is just useless.
My Current SQL:
SELECT
Company_0.CoaCompanyName
, SopOrder_0.SooOrderDate
, Count(DISTINCT SopOrder_0.SooOrderNumber) AS 'Orders'
, SUM(CASE WHEN SopOrder_0.SooOrderNumber IS NOT NULL THEN 1 ELSE 0 END) AS 'Order Items'
, SUM(SopOrderItem_0.SoiValue) AS 'Order Value'
, SUM(CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN 1 ELSE 0 END) AS 'INT'
, SUM(CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'WEB%' THEN 1 ELSE 0 END) AS 'WEB'
FROM
SBS.PUB.Company Company_0
, SBS.PUB.SopOrder SopOrder_0
, SBS.PUB.SopOrderItem SopOrderItem_0
WHERE
SopOrder_0.SopOrderID = SopOrderItem_0.SopOrderID
AND Company_0.CompanyID = SopOrder_0.CompanyID
AND SopOrder_0.SooOrderDate > '2019-01-01'
GROUP BY
Company_0.CoaCompanyName
, SopOrder_0.SooOrderDate
I have tried using the following line but it errors on me when importing:
, Count(DISTINCT CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN SopOrder_0.SooParentOrderReference ELSE 0 END) AS 'INT'
Just so know the error I get when importing at the moment is syntax error at or about "CASE WHEN sopOrder_0.SooParentOrderRefer" (10713)
Try removing the ELSE:
COUNT(DISTINCT CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN SopOrder_0.SooParentOrderReference END) AS num_int
You don't specify the error, but the problem is probably that the THEN is returning a string and the ELSE a number -- so there is an attempt to convert the string values to a number.
Also, learn to use proper, explicit, standard JOIN syntax. Simple rule: Never use commas in the FROM clause.
count distinct on the SooOrderNumber or the SooParentOrderReference, whichever makes more sense for you.
If you are COUNTing, you need to make NULL the thing that your are not counting. I prefer to include an else in the case because it is more consistent and complete.
, Count(DISTINCT CASE WHEN SopOrder_0.SooParentOrderReference LIKE 'INT%' THEN SopOrder_0.SooParentOrderReference ELSE null END) AS 'INT'
Gordon Linoff is correct regarding the source of your error, i.e. datatype mismatch between the case then value else value end. null removes (should remove) this ambiguity - I'd need to double check.
Editing my earlier answer...
Even though it looks, as you say, like count distinct is not supported in Pervasive PSQL, CTEs are supported. So you can do something like...
This is what you are trying to do but it is not supported...
with
dups as
(
select 1 as id, 'A' as col1 union all select 1, 'A' union all select 1, 'B' union all select 2, 'B'
)
select id
,count(distinct col1) as col_count
from dups
group by id;
Stick another CTE in the query to de-duplicate the data first. Then count as normal. That should work...
with
dups as
(
select 1 as id, 'A' as col1 union all select 1, 'A' union all select 1, 'B' union all select 2, 'B'
)
,de_dup as
(
select id
,col1
from dups
group by id
,col1
)
select id
,count(col1) as col_count
from de_dup
group by id;
These 2 versions should give the same result set.
There is always a way!!
I cannot explain the error you are getting. You are mistakenly using single quotes for alias names, but I don't actually think this is causing the error.
Anyway, I suggest you aggregate your order items per order first and only join then:
SELECT
c.coacompanyname
, so.sooorderdate
, COUNT(*) AS orders
, SUM(soi.itemcount) AS order_items
, SUM(soi.ordervalue) AS order_value
, COUNT(CASE WHEN so.sooparentorderreference LIKE 'INT%' THEN 1 END) AS int
, COUNT(CASE WHEN so.sooparentorderreference LIKE 'WEB%' THEN 1 END) AS web
FROM sbs.pub.company c
JOIN sbs.pub.soporder so ON so.companyid = c.companyid
JOIN
(
SELECT soporderid, COUNT(*) AS itemcount, SUM(soivalue) AS ordervalue
FROM sbs.pub.soporderitem
GROUP BY soporderid
) soi ON soi.soporderid = so.soporderid
GROUP BY c.coacompanyname, so.sooorderdate
ORDER BY c.coacompanyname, so.sooorderdate;
I have a result set such as:
Code No
1 *
1 -
1 4
1
1
Now i basically want a query that has 2 columns, a count for the total amount and a count for those that dont have numbers.
Code No_Number Total
1 4 5
Im assuming this needs a group by and a count but how can i do the 2 different counts in a query like this?
This is what i had so far, but i am a bit stuck with the rest of it
SELECT CODE,NO
Sum(Case when No IN ('*', '-', '') then 1 else 0 end) as Count
I think you basically just need GROUP BY:
SELECT CODE,
SUM(Case when No IN ('*', '-', '') then 1 else 0 end) as Count,
COUNT(*) as total
FROM t
GROUP BY CODE;
Well, this took a moment :-), however here it is...I have used a CASE statement to create and populate the No_Number column; the database gives the row in the original table a value of 1 if the original table value is a number or gives it a NULL and discards it from the COUNT if not. Then when it makes the count it is only recognising values which were originally numbers and ignoring everything else..
If the result set is in a table or temp table:
SELECT Code,
COUNT(CASE WHEN [No] NOT LIKE '[0-9]' THEN 1 ELSE NULL END) AS No_Number,
COUNT(Code) AS Total
FROM <tablename>
GROUP BY Code
If the result set is the product of a previous query you can use a CTE (Common Table Expression) to arrive at the required result or you could include parts of this code in the earlier query.