SQL Server: DISTINCT with joins [closed] - sql

I want to print the total number of orders for each state, year-wise. But this prints the same state and year several times instead of once.
SELECT DISTINCT
customer_state,
COUNT(*),
YEAR(order_purchase_timestamp) AS year
FROM
olist_orders_dataset
JOIN
olist_customers_dataset ON olist_orders_dataset.customer_id = olist_customers_dataset.customer_id
GROUP BY
YEAR(order_purchase_timestamp), customer_state
I am getting this output:
State  Year  Num_orders
-----  ----  ----------
AC     2020  123
AC     2020  1234
AC     2019  234
Here is the Required Output:
State  Year  Num_orders
-----  ----  ----------
AC     2020  19995
CA     2020  188891
AL     2019  11999

Firstly, I don't think it should be necessary to point out that your "output" doesn't match your query. Different column order and names. And somehow 2 rows for AC (123 and 1234) become 1 row (19995)? Math isn't that difficult. Simple oversights of that nature suggest a lack of effort.
Your output suggests that the two values you see as "AC" are not actually the same. Typically this means there is a trailing character that is not displayable (e.g., tab or linefeed). If that is the case, then you must first "fix" your data and then fix the process that is populating your table.
To verify that this is the problem, you can convert the column to varbinary to see the hex values stored in it. Example:
select customer_state, cast(customer_state as varbinary(20)) as bin_state, count(*)
from #tbl
where customer_state like 'AC%'
group by customer_state order by customer_state
;
fiddle to demonstrate. If my guess is correct, then there is an important lesson to learn. Stop throwing code into a query to fix a problem without understanding the cause.
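As a rough, self-contained illustration of the same check, here is a sketch that uses SQLite through Python instead of SQL Server. The table name tbl and the sample rows are made up, and SQLite's hex() stands in for the cast to varbinary. It only shows the idea: two values that print as "AC" form separate groups when one carries a hidden trailing tab.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (customer_state TEXT)")  # hypothetical table
# Three rows look like 'AC' when displayed; one hides a trailing tab.
conn.executemany(
    "INSERT INTO tbl VALUES (?)",
    [("AC",), ("AC",), ("AC\t",)],
)

# hex() plays the role of the varbinary cast: it exposes the stored bytes.
rows = conn.execute(
    """
    SELECT customer_state, hex(customer_state) AS hex_state, COUNT(*) AS n
    FROM tbl
    WHERE customer_state LIKE 'AC%'
    GROUP BY customer_state
    ORDER BY customer_state
    """
).fetchall()

for state, hex_state, n in rows:
    print(repr(state), hex_state, n)
```

The extra 09 at the end of the second hex value is the tab character, which is why GROUP BY treats the two values as different states.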
And one last note. Aggregation tends to produce rows in some order, but that is only an artifact of the execution plan and is not guaranteed. If the order of rows in your result set matters (and it usually does), the query MUST have an ORDER BY clause. That applies to every query, not just those using aggregation.
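For reference, a minimal sketch of the grouped count with an explicit ORDER BY, again run on SQLite through Python rather than SQL Server. The shortened table names, the sample rows, and the strftime call (SQLite's substitute for YEAR()) are assumptions made for the example. DISTINCT is left out because GROUP BY already returns one row per state and year.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, customer_state TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER,
                         order_purchase_timestamp TEXT);
    INSERT INTO customers VALUES (1, 'AC'), (2, 'AL'), (3, 'AC');
    INSERT INTO orders VALUES
        (10, 1, '2019-03-01 10:00:00'),
        (11, 1, '2020-05-02 11:00:00'),
        (12, 3, '2020-07-09 12:00:00'),
        (13, 2, '2019-01-15 09:30:00');
    """
)

rows = conn.execute(
    """
    SELECT c.customer_state,
           CAST(strftime('%Y', o.order_purchase_timestamp) AS INTEGER) AS year,
           COUNT(*) AS num_orders
    FROM orders AS o
    JOIN customers AS c ON o.customer_id = c.customer_id
    GROUP BY c.customer_state, year
    ORDER BY year, c.customer_state  -- the only thing that guarantees row order
    """
).fetchall()

for row in rows:
    print(row)
```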

Related

Get intermediate time periods in SQL Server [closed]

I want to generate a table in SQL of intermediate joined states. E.g. I have the following table
status_1 status_2 start_date_V1 end_date_v1 start_date_2 end_date_v2
--------------------------------------------------------------------------------
A B 01Jan2018 31Jul2018 31Dec2017 31Jan2018
A C 01Jan2018 31Jul2018 01Feb2018 30Dec2018
In this table there are start and end dates of the different states "status_1" and "status_2". I want to have the information about the changes of the two joined states. The desired table would be:
status_1 status_2 start_date end_date
-----------------------------------------------
A B 01Jan2018 31Jan2018
A C 01Feb2018 31Jul2018
The following image might help to understand the problem:
Can anyone help?
Seems like you need the intersecting time period(?). That can be solved with a simple CASE WHEN ... ELSE expression for each date in the query result.
SELECT
    [status_1],
    [status_2],
    -- An overlap starts at the later start date and ends at the earlier end date.
    [start_date] = CASE WHEN [start_date_V1] > [start_date_2] THEN [start_date_V1] ELSE [start_date_2] END,
    [end_date] = CASE WHEN [end_date_v1] < [end_date_v2] THEN [end_date_v1] ELSE [end_date_v2] END
FROM [Table] -- replace with your table name
If you've got many date columns (a known number of them), it'd be cleaner to write it as below. However, beware that subqueries like this can slow down your queries tremendously if you don't know what you're doing.
SELECT
    status_1,
    status_2,
    -- VALUES turns the listed columns into rows of a derived table named value,
    -- whose single column is named by the alias list, e.g. value(StartDate).
    [start_date] = (SELECT MAX(StartDate) FROM (VALUES (start_date_V1), (start_date_2)) AS value(StartDate)),
    [end_date] = (SELECT MIN(EndDate) FROM (VALUES (end_date_v1), (end_date_v2)) AS value(EndDate))
FROM [Table] -- replace with your table name
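To check the idea against the sample rows from the question, here is a small sketch on SQLite through Python. The table name periods and the ISO date strings are assumptions (ISO text dates compare correctly as strings). The overlap of two periods runs from the later of the two start dates to the earlier of the two end dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE periods (
        status_1 TEXT, status_2 TEXT,
        start_date_v1 TEXT, end_date_v1 TEXT,
        start_date_2 TEXT, end_date_v2 TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO periods VALUES (?, ?, ?, ?, ?, ?)",
    [
        ("A", "B", "2018-01-01", "2018-07-31", "2017-12-31", "2018-01-31"),
        ("A", "C", "2018-01-01", "2018-07-31", "2018-02-01", "2018-12-30"),
    ],
)

rows = conn.execute(
    """
    SELECT status_1, status_2,
           -- later of the two starts
           CASE WHEN start_date_v1 > start_date_2
                THEN start_date_v1 ELSE start_date_2 END AS start_date,
           -- earlier of the two ends
           CASE WHEN end_date_v1 < end_date_v2
                THEN end_date_v1 ELSE end_date_v2 END AS end_date
    FROM periods
    ORDER BY start_date
    """
).fetchall()

for row in rows:
    print(row)
```

If two periods do not overlap at all, this yields a start date after the end date, so such rows would need to be filtered out separately.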

How to use Sum() function with condition? [closed]

I have an Access database. There is a table named cost with the following values:
reson Cost Type
------------ ------ ------
A1 2500 1
A1 6500 1
A2 95000 2
A3 2500 1
A1 6500 1
A4 50000 2
Now I want a query that calculates the sum of the cost field where type = 2 and the sum of the cost field where type = 1, and subtracts the second value from the first.
For example, for the data above the final result is:
Sum of Type 2 = 145000
Sum of Type 1 = 18000
-------------------------
Final Result = 127000
My SQL code:
select iif(type = 2, sum(cost), -sum(cost)) As col1 from cost group by type
First off, I'm sorry you have to deal with such obnoxious hostility when asking your question here. You asked your question perfectly fine, laying out your table structure, and your desired result. It's understandable that you are new to queries and need help creating them. Not every answer requires code, and not every person knows where to start.
Here is your answer:
Step 1
Make sure you have your table created with the data you provided
Step 2
Create a new query named qySumType1. Build it like this, so it sums everything of type=1. Make sure to click the Totals button.
Step 3
Create another query, name this one qySumType2. This query should sum everything of type=2.
Step 4
Now create another query called "Final". Add both of your previous queries to it. Now create an expression in the last column to calculate the difference between the 2 numbers. Just like this.
And there you have it. Now just run the Final query anytime you want to get the difference.
Hope this helps! I can't tell you how many times I've started learning something new and relied on a community to help me get started. Always just try your best and wait for a decent answer to your question. Good luck!
Change T1 to the name of your table.
SELECT Sum(T.Type1) AS Type1, Sum(T.Type2) AS Type2, Sum(T.Type2) - Sum(T.Type1) AS DIFF
FROM
(
SELECT Sum(T1.Cost) AS Type1, 0 AS Type2
FROM T1
WHERE (((T1.Type)=1))
UNION
SELECT 0 AS Type1, Sum(T1.Cost) AS Type2
FROM T1
WHERE (((T1.Type)=2))
) AS T;
Type1 | Type2 | DIFF
18000 | 145000 | 127000
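Another way to get the same number is a single conditional sum: flip the sign of each row according to its type and aggregate once. The sketch below runs on SQLite through Python with the data from the question. CASE is used because SQLite has no IIf in older versions; in Access the inner expression would be written IIf(Type = 2, Cost, -Cost). Treat it as an illustration, not as a tested Access query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cost (reson TEXT, cost INTEGER, type INTEGER)")
conn.executemany(
    "INSERT INTO cost VALUES (?, ?, ?)",
    [
        ("A1", 2500, 1),
        ("A1", 6500, 1),
        ("A2", 95000, 2),
        ("A3", 2500, 1),
        ("A1", 6500, 1),
        ("A4", 50000, 2),
    ],
)

# Decide the sign per row, then aggregate once. There is no GROUP BY,
# so exactly one row comes back instead of one row per type.
(final_result,) = conn.execute(
    "SELECT SUM(CASE WHEN type = 2 THEN cost ELSE -cost END) FROM cost"
).fetchone()

print(final_result)  # 127000
```

The query in the question returned two rows because it grouped by type and applied the sign after summing; moving the condition inside SUM removes the need for the grouping.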

SQL query to extract a number and its decimal variations [closed]

I have a column with all kinds of numbers. I am specifically trying to extract numbers that have either
555
or
555.xx
or
555.x
The output should look like this
555
555.1
555.5
555.9
555.58
555.22
.
.
I.e., I need an SQL query that returns the rows holding just the number 555, with or without a decimal fraction, from my column of arbitrary numbers.
You can try a LIKE condition:
WHERE Col LIKE '555.%'
OR Col = '555'
As a fast approach I would do
CAST((ValueOfTable * 100.00) AS DECIMAL(18, 2))
My table name is Diagnosis and the column name is code. Where should I add the table name and column name in this code?
In your situation:
SELECT
CAST((code * 100.00) AS DECIMAL(18, 2))
FROM Diagnosis ;
I'm expecting the column to be an integer. You can find out by executing:
\d Diagnosis ;
one of the output lines should look similar to
(...)
code | integer |
(...)
Assuming the column contains numbers (not as string/varchar), search for "number>=555 and number<556". This would give you 555, 555.01... etc.
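A quick sketch of that range filter, run on SQLite through Python with made-up values in a numeric column. The half-open range keeps 555 and every 555.x value while excluding 556 and lookalikes such as 5550.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE diagnosis (code REAL)")  # numeric column (assumption)
conn.executemany(
    "INSERT INTO diagnosis VALUES (?)",
    [(555,), (555.1,), (555.58,), (554.99,), (556,), (5550,)],
)

# Half-open range: 555 <= code < 556
rows = [
    code
    for (code,) in conn.execute(
        "SELECT code FROM diagnosis WHERE code >= 555 AND code < 556 ORDER BY code"
    )
]
print(rows)
```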
If the CODE column is a varchar (which I understand it to be from your comments, but you might want to clarify that in the question body itself), and can/does contain values that are not numbers, then you have to be very careful about using functions which only accept a number.
With this sample data, you can see that having one value that can't be cast to a number will cause the whole query to error out.
select * from diagnosis order by code;
CODE
-----------
555
555.0
555.43
555.99
Not a Num
(5 rows)
select code + 1 from diagnosis;
ERROR: pg_atoi: error in "Not a Num": can't parse "Not a Num"
The usual solution to this is to either match the column value via regular expression, or use a function to test whether or not the value in each row is a number.
Here are two solutions, each of which depends on a function that is provided with Netezza, but not necessarily installed by default. Your administrator can install these for you.
The first uses the regexp_instr from the SQL Extension Toolkit. Here you use a regular expression to match the values you want without having to do an actual CAST (implicit or explicit) to a numeric.
SELECT code FROM diagnosis
WHERE regexp_instr(code, '^555(\.\d+)?$') > 0;
CODE
--------
555
555.0
555.43
555.99
(4 rows)
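The same pattern can be tried outside Netezza. The sketch below uses SQLite through Python, which has no built-in regexp_instr, so a small regexp function is registered by hand; that function and the extra sample values are assumptions made for the example. The point is only that matching on the text avoids casting non-numeric values.

```python
import re
import sqlite3


def regexp(pattern, value):
    # SQLite evaluates "X REGEXP Y" as regexp(Y, X): pattern first, value second.
    return value is not None and re.search(pattern, value) is not None


conn = sqlite3.connect(":memory:")
conn.create_function("regexp", 2, regexp)
conn.execute("CREATE TABLE diagnosis (code TEXT)")
conn.executemany(
    "INSERT INTO diagnosis VALUES (?)",
    [("555",), ("555.0",), ("555.43",), ("555.99",),
     ("Not a Num",), ("5555",), ("1555.2",)],
)

# Same pattern as above: 555, optionally followed by a dot and digits.
rows = [
    code
    for (code,) in conn.execute(
        r"SELECT code FROM diagnosis WHERE code REGEXP '^555(\.\d+)?$' ORDER BY code"
    )
]
print(rows)
```

The anchors ^ and $ are what keep values such as 5555 and 1555.2 out of the result.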
The second solution, which is a bit more involved, uses the isnumber() UDF provided as part of the Netezza InDatabase Analytics package (in the /nz/extensions/nz/nzlua/examples directory when installed), to test whether the CODE column is numeric before casting CODE to a numeric.
SELECT code
FROM (
SELECT code
FROM diagnosis
WHERE isnumber(code)
)
foo
WHERE floor(code::NUMERIC(38,2)) = 555;
CODE
--------
555
555.0
555.43
555.99
(4 rows)
Both of these functions are included with Netezza, but both require installation by your administrator before you can use them. In each case this is a simple task for the administrator, although they may not be aware of their availability.

Oracle SQL : Union and Joins [closed]

I have a table ACCPLAN(PRIMARY KEY : ACCOUNT_ID)
ACCOUNT_ID PLAN_TYPE OTHER_STUFF
ACC1 PLAN_TYPE_ONE ....
ACC2 PLAN_TYPE_TWO ....
ACC3 PLAN_TYPE_ONE ....
ACC4 PLAN_TYPE_TWO ...
I have one more table ACCTRANSACTION (PRIMARY KEY -> (ACCOUNT_ID,TRANSACTION_ID))
ACCOUNT_ID TRANSACTION_ID TRANSACTION_AMOUNT TXN_TYPE
ACC1 1 100 TXN_TYPE_1
ACC1 2 300 TXN_TYPE_2
ACC2 1 400 TXN_TYPE_2
ACC3 1 400 TXN_TYPE_3
There are 5 fixed plan_types and 20 fixed txn_types. Only a few transaction types are
possible for each plan_type (e.g. TXN_TYPE_1 and TXN_TYPE_2 are possible for
PLAN_TYPE_ONE, and TXN_TYPE_2 and TXN_TYPE_3 are possible for PLAN_TYPE_TWO).
I am trying to retrieve the transaction information from ACCTRANSACTION and other
details from ACCPLAN.
This can be done in 2 ways:
APPROACH 1
Retrieve for each plan_type and do an union
select ap.account_id, ap.other_stuff, at.transaction_amount
from accplan ap, acctransaction at
where ap.account_id = at.account_id
and ap.plan_type = 'PLAN_TYPE_ONE'
and at.txn_type in ('TXN_TYPE_1', 'TXN_TYPE_2')
union
select ap.account_id, ap.other_stuff, at.transaction_amount
from accplan ap, acctransaction at
where ap.account_id = at.account_id
and ap.plan_type = 'PLAN_TYPE_TWO'
and at.txn_type in ('TXN_TYPE_2', 'TXN_TYPE_3')
union
...
APPROACH 2
Retrieve using one query for all plan_types
select ap.account_id, ap.other_stuff, at.transaction_amount
from accplan ap, acctransaction at
where ap.account_id = at.account_id
and
((ap.plan_type = 'PLAN_TYPE_ONE' and at.txn_type in ('TXN_TYPE_1', 'TXN_TYPE_2'))
or
(ap.plan_type = 'PLAN_TYPE_TWO' and at.txn_type in ('TXN_TYPE_2', 'TXN_TYPE_3')));
Which approach is better, considering both tables hold a huge amount of data? Please suggest.
Use the second approach (one join with the combined condition). A UNION has to remove duplicate rows from the combined result, which typically means sorting or hashing all of it, and that is an expensive operation for your database.
Furthermore, it is better to read the table once and do some more complex checks on each record than to read it several times just to make smaller checks.
Disclaimer: I can imagine some very strange corner cases where the first query runs faster, if the query planner decides that the big condition is not selective enough and does not use an index, while each of the smaller ones does use it. The bigger the number of rows, the more I would use the second option.
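One more thing worth checking before comparing speed: the two approaches are not guaranteed to return the same rows. UNION removes duplicates, so two transactions of the same account with the same amount collapse into one row, while the single query keeps both. The sketch below shows this on SQLite through Python with made-up data (the alias tx and the other_stuff values are assumptions); UNION ALL is the variant that matches the single query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE accplan (account_id TEXT PRIMARY KEY, plan_type TEXT, other_stuff TEXT);
    CREATE TABLE acctransaction (account_id TEXT, transaction_id INTEGER,
                                 transaction_amount INTEGER, txn_type TEXT,
                                 PRIMARY KEY (account_id, transaction_id));
    INSERT INTO accplan VALUES ('ACC1', 'PLAN_TYPE_ONE', 'x'),
                               ('ACC2', 'PLAN_TYPE_TWO', 'y');
    -- ACC1 has two transactions with the same amount.
    INSERT INTO acctransaction VALUES
        ('ACC1', 1, 100, 'TXN_TYPE_1'),
        ('ACC1', 2, 100, 'TXN_TYPE_2'),
        ('ACC2', 1, 400, 'TXN_TYPE_3');
    """
)

BRANCH = """
    SELECT ap.account_id, ap.other_stuff, tx.transaction_amount
    FROM accplan ap JOIN acctransaction tx ON ap.account_id = tx.account_id
    WHERE ap.plan_type = '{plan}' AND tx.txn_type IN ({txns})
"""
one = BRANCH.format(plan="PLAN_TYPE_ONE", txns="'TXN_TYPE_1', 'TXN_TYPE_2'")
two = BRANCH.format(plan="PLAN_TYPE_TWO", txns="'TXN_TYPE_2', 'TXN_TYPE_3'")

union_rows = conn.execute(one + " UNION " + two).fetchall()
union_all_rows = conn.execute(one + " UNION ALL " + two).fetchall()
single_rows = conn.execute(
    """
    SELECT ap.account_id, ap.other_stuff, tx.transaction_amount
    FROM accplan ap JOIN acctransaction tx ON ap.account_id = tx.account_id
    WHERE (ap.plan_type = 'PLAN_TYPE_ONE' AND tx.txn_type IN ('TXN_TYPE_1', 'TXN_TYPE_2'))
       OR (ap.plan_type = 'PLAN_TYPE_TWO' AND tx.txn_type IN ('TXN_TYPE_2', 'TXN_TYPE_3'))
    """
).fetchall()

print(len(union_rows), len(union_all_rows), len(single_rows))
```

Since each account has exactly one plan_type, the branches cannot overlap, so UNION ALL returns the same rows as the single query without the duplicate-removal step.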

Sql counting distinct records and grouping the count by the record name [closed]

Thanks for taking your time to read and help me out! I appreciate it.
I have a database like so:
1 VISA TYPE ISSUING NATIN
2 F1 EN
3 J1 MX
I'm trying to make a query that only looks at the VISA TYPEs F1 and J1. Looking only at those records and ignoring those that aren't F1 or J1, I need to count how many are from each country, then store the result in a table like the one below.
1 ISSUING NATIN NUMBER OF STUDENTS
2 EN 5
3 MX 10
Here is what I was trying. I'm brand new to Access and MySQL, so keep that in mind.
SELECT Count(*) AS N
FROM (SELECT DISTINCT 'ISSUEING NATN' FROM 201310)
WHERE [201310].[VISA TYPE]='F1' OR [201310].[VISA TYPE]='J1'
GROUP BY 'ISSUING NATIN';
UPDATE: Well, I got this baffling problem fixed... how? Don't ask me... it looks almost the same as eggy's.
SELECT
[ISSUED NATION],
COUNT(*)
FROM
201310
GROUP BY
[ISSUED NATION];
Right now I'm getting a syntax error with GROUP BY ISSUING NATIN, but I think I need to change some other things as well. Any insight? Thank you!
EDIT: Fixed the ISSUING NATIN syntax error, but it's not working correctly!
SELECT [ISSUING NATIN], COUNT(*)
FROM [201310]
WHERE [VISA TYPE] IN ('F1', 'J1')
GROUP BY [ISSUING NATIN]
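To see what this query returns, here is a small sketch on SQLite through Python, which happens to accept the same square-bracket identifiers as Access. The sample rows are invented, and the ORDER BY and column alias are additions for readable output. Rows with other visa types are filtered out before counting, so a nation that only has such rows does not appear at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE [201310] ([VISA TYPE] TEXT, [ISSUING NATIN] TEXT)")
conn.executemany(
    "INSERT INTO [201310] VALUES (?, ?)",
    [
        ("F1", "EN"), ("J1", "EN"), ("B2", "EN"),
        ("F1", "MX"), ("J1", "MX"), ("J1", "MX"),
        ("H1", "US"),
    ],
)

rows = conn.execute(
    """
    SELECT [ISSUING NATIN], COUNT(*) AS [NUMBER OF STUDENTS]
    FROM [201310]
    WHERE [VISA TYPE] IN ('F1', 'J1')   -- filter first
    GROUP BY [ISSUING NATIN]            -- then count per nation
    ORDER BY [ISSUING NATIN]
    """
).fetchall()

for row in rows:
    print(row)
```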