Visual
I have tried the following SQL query already in SAS (if there is a better way to use SAS please let me know):
Select *
From [Database]
Having COUNT(Month)>1 and SUM(Amount)=0;
This then only shows User_ID 003 and User_ID 004. The problem is I need it to show User_ID 001's first and third row since they offset (zero-out) in the same month along with 003 and 004. Basically, I am looking for duplicate values in certain columns with amounts that offset in a dataset that has 20,000+ rows (the Visual link above is not related to the dataset I am working on).
I have also tried separating the data so that only positive numbers are in the Amount column and negative numbers are in a second Amount2 column, then running a query to find -Amount in Amount2. This didn't work either.
Thanks in advance.
This will return the user_ids where a duplicate case exists in a month:
select a.user_id, a.month, a.amount
from my_table as a
inner join my_table as b
on a.user_id = b.user_id
and a.month = b.month
and a.amount = -1 * b.amount
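As a rough illustration of the self-join idea, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. The table name my_table, its columns, and the sample rows are assumptions made up for the demo (they are not the asker's data), and SAS PROC SQL syntax may differ in details.

```python
import sqlite3

# Hypothetical sample data: user 001 has an offsetting pair (50 / -50) in Jan,
# user 003 has an offsetting pair in Feb, user 002 has nothing that offsets.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (user_id TEXT, month TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("001", "Jan", 50),
        ("001", "Jan", 20),
        ("001", "Jan", -50),
        ("002", "Jan", 30),
        ("003", "Feb", 10),
        ("003", "Feb", -10),
    ],
)

# Join the table to itself: a row qualifies when another row for the same
# user and month carries the exact opposite amount.
offsetting_rows = conn.execute(
    """
    SELECT a.user_id, a.month, a.amount
    FROM my_table AS a
    INNER JOIN my_table AS b
        ON a.user_id = b.user_id
       AND a.month = b.month
       AND a.amount = -1 * b.amount
    ORDER BY a.user_id, a.amount
    """
).fetchall()

for row in offsetting_rows:
    print(row)
```

Note that a row with amount 0 would match itself under this join condition, and an amount that has several opposite partners would be returned once per partner, so real data may need an extra filter or a DISTINCT.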
I have a query that I converted from Access and that is currently working correctly in Teradata SQL Assistant. The data pulled is just a standard table with all of the data I need.
What I am wondering is: can something be added to this query that will sum up all of the Exposure values and then show only the top 5 Divisions, ordered from greatest to smallest sum? I would also like to transpose the data so that my Topics are the leftmost column.
Here is the working code, details omitted.
SELECT
A.AS_OF_DT
, B.DIVISION
, B.CLASS
, Sum(A.BALANCE/1000000) AS "Bal in MMs"
, Sum(A.EXPOSURE/1000000) AS "Exp in MMs"
, Sum(CASE WHEN A.STATUS = 'NACC' THEN (B.BALANCE/1000000) ELSE 0 END) AS "NPL Bal as MMs"
FROM DB.TABLE1 A LEFT JOIN DB.TABLE2 B ON A.NAICS = B.NAICS_CD
WHERE A.AS_OF_DT= '2017-03-31'
GROUP BY
A.AS_OF_DT,
B.DIVISION,
B.CLASS
ORDER BY SUM (A.EXPOSURE/1000000) DESC
Essentially I want the columns to be the following:
DIVISION|DATE|
Below DIVISION would only be the Top 5 DIVISIONS summarized by EXPOSURE (under DATE)
I can try and clarify if needed. Just let me know.
Thanks!
The end result is to have a data paste I can drop into Excel without the manual work of transposing the data in Excel, writing formulas to rummage through the thousands of results of the base query to summarize the individual Divisions, and then picking the top 5 each month.
Thanks!
Shill
To get the top 5 for each division, you can use QUALIFY.
Add this to the end of your query:
QUALIFY ROW_NUMBER() OVER (PARTITION BY A.AS_OF_DT, B.DIVISION ORDER BY SUM(A.EXPOSURE/1000000) DESC) <= 5
For your other questions, SQL Assistant isn't much of a presentation tool; it won't do what you are asking for.
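QUALIFY is specific to Teradata (and a few other engines). A minimal sketch of the same top-N-per-group idea, using ROW_NUMBER() in a subquery and run here against SQLite from Python, might look like the following. The table exposures, its columns, the sample rows, and the choice of N = 2 are all assumptions for illustration, and it needs a SQLite build with window function support (3.25 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE exposures "
    "(as_of_dt TEXT, division TEXT, class TEXT, exposure INTEGER)"
)
# Hypothetical rows; Retail/A appears twice so the SUM has something to add up.
conn.executemany(
    "INSERT INTO exposures VALUES (?, ?, ?, ?)",
    [
        ("2017-03-31", "Retail", "A", 5),
        ("2017-03-31", "Retail", "A", 4),
        ("2017-03-31", "Retail", "B", 7),
        ("2017-03-31", "Retail", "C", 1),
        ("2017-03-31", "Mining", "X", 3),
        ("2017-03-31", "Mining", "Y", 8),
    ],
)

top_n = 2  # the thread asks for 5; 2 keeps the demo small

top_rows = conn.execute(
    """
    WITH totals AS (
        -- one row per date / division / class with the summed exposure
        SELECT as_of_dt, division, class, SUM(exposure) AS total_exposure
        FROM exposures
        GROUP BY as_of_dt, division, class
    ),
    ranked AS (
        -- number the rows inside each date / division, largest total first
        SELECT as_of_dt, division, class, total_exposure,
               ROW_NUMBER() OVER (
                   PARTITION BY as_of_dt, division
                   ORDER BY total_exposure DESC
               ) AS rn
        FROM totals
    )
    SELECT division, class, total_exposure
    FROM ranked
    WHERE rn <= ?
    ORDER BY division, rn
    """,
    (top_n,),
).fetchall()

for row in top_rows:
    print(row)
```

Filtering on rn in an outer query plays the role that QUALIFY plays in Teradata. Ties are broken arbitrarily by ROW_NUMBER(); RANK() would keep tied rows instead.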
If your query already works,
try replacing:
SELECT
with:
SELECT TOP 10
(line 1)
I'm new to SQL and very much a novice with this package. SQL is running in Watson DashDB. For the past few hours I've been struggling to find the correct code.
The code is trying to accomplish a few things.
Create a new view called SENTIMENT
Join two tables together
Have the new table show 4 columns: A. USER_SCREEN_NAME, B. Total Tweets, C. Positive SENTIMENT Count, D. Negative SENTIMENT Count
The code below only creates 2 columns; I need 4. SPACEX_SENTIMENTS.SENTIMENT_POLARITY contains both Negative and Positive values.
CREATE VIEW SENTIMENT
AS
(SELECT SPACEX_TWEETS.USER_SCREEN_NAME, SPACEX_SENTIMENTS.SENTIMENT_POLARITY
FROM dash015214.SPACEX_TWEETS
LEFT JOIN dash015214.SPACEX_SENTIMENTS ON
SPACEX_TWEETS.MESSAGE_ID=SPACEX_SENTIMENTS.MESSAGE_ID);
SELECT USER_SCREEN_NAME, COUNT(1) tweetsCount
FROM dash015214.SENTIMENT
GROUP BY USER_SCREEN_NAME
HAVING COUNT (1)>1
ORDER BY COUNT (USER_SCREEN_NAME) DESC
FETCH FIRST 20 ROWS ONLY;
It looks like your view SENTIMENT has one row per tweet, with two columns: the user name and then a polarity column. From your comment, I assume the polarity column can have the values of either 'POSITIVE' or 'NEGATIVE'. I think you can get what you want with this query:
SELECT
USER_SCREEN_NAME,
COUNT(1) AS "Total Tweets",
COUNT(CASE SENTIMENT_POLARITY WHEN 'POSITIVE' THEN 1 ELSE NULL END) AS "Positive Tweets",
COUNT(CASE SENTIMENT_POLARITY WHEN 'NEGATIVE' THEN 1 ELSE NULL END) AS "Negative Tweets"
FROM
SENTIMENT
GROUP BY USER_SCREEN_NAME
HAVING COUNT(1) > 1
ORDER BY COUNT(1) DESC;
This will give you everyone with at least 2 tweets (is that what you want?), and tell you the number of tweets per user and how many were positive and how many were negative. Replace " with whatever your SQL uses to indicate column names.
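To see the conditional-count pattern in action, here is a small sketch that runs it against an in-memory SQLite database from Python. The table sentiment stands in for the view, and the sample rows (including one tweet with no polarity, as a LEFT JOIN could produce) are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in for the SENTIMENT view: one row per tweet.
conn.execute(
    "CREATE TABLE sentiment (user_screen_name TEXT, sentiment_polarity TEXT)"
)
conn.executemany(
    "INSERT INTO sentiment VALUES (?, ?)",
    [
        ("alice", "POSITIVE"),
        ("alice", "NEGATIVE"),
        ("alice", "POSITIVE"),
        ("bob", "NEGATIVE"),
        ("bob", None),       # tweet without a sentiment row
        ("carol", "POSITIVE"),
    ],
)

# COUNT ignores NULLs, so each CASE counts only the rows of one polarity.
tweet_counts = conn.execute(
    """
    SELECT
        user_screen_name,
        COUNT(1) AS total_tweets,
        COUNT(CASE sentiment_polarity WHEN 'POSITIVE' THEN 1 ELSE NULL END)
            AS positive_tweets,
        COUNT(CASE sentiment_polarity WHEN 'NEGATIVE' THEN 1 ELSE NULL END)
            AS negative_tweets
    FROM sentiment
    GROUP BY user_screen_name
    HAVING COUNT(1) > 1
    ORDER BY COUNT(1) DESC, user_screen_name
    """
).fetchall()

for row in tweet_counts:
    print(row)
```

carol is dropped by the HAVING clause because she has only one tweet, and bob's tweet without a polarity counts toward his total but toward neither the positive nor the negative column.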
I have two tables, A & B.
Table A has a column called Nominal which is a float.
Table B has a column called Units which is also a float.
I have a simple select query that highlights any differences between Nominals in table A & Units in table B.
select coalesce(A.Id, B.Id) Id, A.Nominal, B.Units, isnull(A.Nominal, 0) - isnull(B.Units, 0) Diff
from tblA A full outer join tblB B
on A.Id = B.Id
where isnull(A.Nominal, 0) - isnull(B.Units, 0) <> 0
This query works. However, this morning I have a slight problem.
The query is showing one line as having a difference:
Id Nominal Units Diff
FJLK 100000 100000 1.4515E-11
So obviously one or both of the figures are not exactly 100,000. However, when I run a select query on each table individually for this Id, both return 100,000, so I can't see which one has the decimal places. Why is this? Is this some sort of default display in SQL Server?
You will find the same kind of behavior in Excel.
It's a standard way to represent very small numbers (scientific notation). The number 1.4515E-11 you got is the same as 1.4515 * 10^(-11).
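A small Python sketch can make this concrete. It is only an illustration of how a floating-point value can display as 100000 while still differing from 100000 by a tiny amount; the stored value used here is an assumption (the smallest double above 100000), so the difference it prints is close to, but not the same as, the one in the question. The tolerance of 1e-6 at the end is also an assumed value, not something from the thread.

```python
import math

# E-notation is just a compact way of writing a power of ten.
same_number = math.isclose(float("1.4515E-11"), 1.4515 * 10 ** -11)

# Hypothetical stored values: nominal is the next representable double
# above 100000 (doubles near 100000 are spaced 2**-36 apart).
nominal = 100000.0 + 2.0 ** -36
units = 100000.0
diff = nominal - units

# A display that rounds to 10 significant digits hides the extra fraction...
shown_nominal = f"{nominal:.10g}"
# ...but the subtraction still exposes it.
shown_diff = f"{diff:.4E}"

# Comparing against a small tolerance instead of <> 0 ignores such noise.
within_tolerance = abs(diff) < 1e-6

print(same_number, shown_nominal, shown_diff, within_tolerance)
```

In the original query, the analogous change would be to compare the absolute difference against a small threshold rather than testing for a non-zero difference, or to store the amounts in an exact decimal type.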
I have seen other questions like this one but feel mine is a bit different, or I didn't quite understand the SQL in the other questions, so my apologies if this one is redundant or very easy.
Anyway, I have an accounting transaction DB that stores every transaction posting within our financial system on one line. What I am trying to do is net the sum of the debits and the credits for each GL account.
Here are the two basic queries I am executing to get the results that I would like to net.
Query 1 gives me the sum of all debit transactions posting to each gl account:
Select gl_debit, sum (amt) from FISC_YEAR2014 where fund = 'XXX'
group by gl_debit
Query 2 gives me the sum of all credit transactions posting to each gl account:
select gl_credit, sum (amt) from FISC_YEAR2014 where fund = 'XXX'
group by gl_credit
Now I would like to subtract the credit amounts from the debit amounts to get net totals for each gl account. Make sense?
Thanks.
There are two ways to do this depending on your table definition. I think your situation is the first.
This is the normal way assuming credits and debits are in separate columns:
SELECT sum(gl_debit)-sum(gl_credit) as net_debit
FROM FISC_YEAR2014
WHERE fund = 'XXX'
This is the other way assuming direction is indicated by a separate column:
SELECT SUM(IF(is_debit=1,amount,-1*amount)) as net_debit
FROM FISC_YEAR2014
WHERE fund = 'XXX'
See also:
MySQL 'IF' in 'SELECT' statement
Can't calculate totals in general ledger report
What's a good way to store a financial ledger?
I believe this is what you need:
select
gl_account,
sum(amt)
from
(
select gl_debit gl_account,
sum(-amt) amt
from fisc_year2014
where fund = 'XXX'
group by gl_debit
union all
select gl_credit,
sum(amt)
from fisc_year2014
where fund = 'XXX'
group by gl_credit
)
group by
gl_account
There are two SELECTs: one to get the (negative) debits and another to get the credits. They are UNIONed to create a two-column result. The outer SELECT then aggregates the total sum by the gl_account code. If there is a mismatch (a gl_debit without a gl_credit, or vice-versa), then its amount would still be displayed.
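Here is a minimal sketch of that UNION ALL approach, run against an in-memory SQLite database from Python. The column layout follows the question, but the sample rows are made up for the demo. As in the query above, debits are negated, so the result is credits minus debits per account.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fisc_year2014 "
    "(gl_debit INTEGER, gl_credit INTEGER, amt INTEGER, fund TEXT)"
)
# Hypothetical postings: each row debits one account and credits another.
conn.executemany(
    "INSERT INTO fisc_year2014 VALUES (?, ?, ?, ?)",
    [
        (101, 201, 100, "XXX"),
        (101, 301, 50, "XXX"),
        (201, 101, 30, "XXX"),
        (401, 201, 20, "XXX"),   # 401 is only ever debited
        (101, 201, 999, "YYY"),  # other fund, filtered out
    ],
)

net_by_account = conn.execute(
    """
    SELECT gl_account, SUM(amt) AS net
    FROM (
        -- debits enter as negative amounts
        SELECT gl_debit AS gl_account, SUM(-amt) AS amt
        FROM fisc_year2014
        WHERE fund = 'XXX'
        GROUP BY gl_debit
        UNION ALL
        -- credits enter as positive amounts
        SELECT gl_credit, SUM(amt)
        FROM fisc_year2014
        WHERE fund = 'XXX'
        GROUP BY gl_credit
    ) AS postings
    GROUP BY gl_account
    ORDER BY gl_account
    """
).fetchall()

for row in net_by_account:
    print(row)
```

Accounts that appear on only one side (301 and 401 here) still show up, and the nets sum to zero because every posting is counted once as a debit and once as a credit. Swap the signs in the two inner SELECTs if you want debits minus credits instead.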
SQLFiddle here (I added another row to show the effect of mismatched IDs)
To do this you should SUM the debits and credits separately in subqueries, then join those subqueries on gl_credit = gl_debit.
SELECT COALESCE(gl_credit, gl_debit) AS Id
,COALESCE(d.amt,0)-COALESCE(c.amt,0) AS Net
FROM (
SELECT gl_debit, SUM(amt) AS amt
FROM FISC_YEAR2014
GROUP BY gl_debit
) d
FULL OUTER JOIN (
SELECT gl_credit, SUM(amt) AS amt
FROM FISC_YEAR2014
GROUP BY gl_credit
) c ON d.gl_debit = c.gl_credit
ORDER BY COALESCE(gl_credit, gl_debit)
SQLFiddle
Outputs:
ID Net
-----------
101 -475
201 225
301 500
501 -250
If I were you, rather than using a FULL OUTER JOIN I'd select the ids from the accounts table (or wherever you store them) and then LEFT JOIN both of the subqueries to it. You haven't shown any other tables, though, so I can only speculate.
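A rough sketch of that accounts-table variant, run against an in-memory SQLite database from Python, could look like this. The table gl_accounts is purely hypothetical (the question does not show one), as are the sample rows. Here the net is debits minus credits, matching the FULL OUTER JOIN query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# gl_accounts is an assumed master list of account ids.
conn.execute("CREATE TABLE gl_accounts (gl_account INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO gl_accounts VALUES (?)",
    [(101,), (201,), (301,), (401,), (501,)],
)
conn.execute(
    "CREATE TABLE fisc_year2014 "
    "(gl_debit INTEGER, gl_credit INTEGER, amt INTEGER, fund TEXT)"
)
conn.executemany(
    "INSERT INTO fisc_year2014 VALUES (?, ?, ?, ?)",
    [
        (101, 201, 100, "XXX"),
        (101, 301, 50, "XXX"),
        (201, 101, 30, "XXX"),
        (401, 201, 20, "XXX"),
    ],
)

net_per_account = conn.execute(
    """
    SELECT a.gl_account,
           COALESCE(d.amt, 0) - COALESCE(c.amt, 0) AS net
    FROM gl_accounts AS a
    LEFT JOIN (
        -- total debited to each account
        SELECT gl_debit, SUM(amt) AS amt
        FROM fisc_year2014
        WHERE fund = 'XXX'
        GROUP BY gl_debit
    ) AS d ON d.gl_debit = a.gl_account
    LEFT JOIN (
        -- total credited to each account
        SELECT gl_credit, SUM(amt) AS amt
        FROM fisc_year2014
        WHERE fund = 'XXX'
        GROUP BY gl_credit
    ) AS c ON c.gl_credit = a.gl_account
    ORDER BY a.gl_account
    """
).fetchall()

for row in net_per_account:
    print(row)
```

Because the account list drives the query, an account with no postings at all (501 here) still appears with a net of 0, which a FULL OUTER JOIN of the two subqueries would not show.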
I am running into a problem: I know what is happening, but I can't see what is causing it. Below is the SQL code I am using. I am getting roughly the results I want, but my 'where' clause is not correct.
In case it helps, the count is coming out like this:
Total Tier
1 High
2 Low
There are 4 records in the Enrollment table. 3 are active, and 1 is not. Only 2 of the records should be displayed: 1 for High and 1 for Low. The second Low record that is in the total was flagged as 'inactive' on 12/30/2010 and reflagged on 1/12/2011, so it should not be in the results. I changed the initial '<=' to '=' and the results stayed the same.
I need to exclude any record from Enrollment_Status_Change where the "active_status" was changed to 0 before the date.
SELECT COUNT(e.Customer_ID) AS Total,
p.Tier
FROM dbo.Phone_Tier as p
JOIN dbo.Enrollments as e ON p.Phone_Model = e.Phone_Model
WHERE (e.Customer_ID NOT IN
(Select Customer_ID
From dbo.Enrollment_Status_Change as Status
Where (Change_Date >'12/31/2010')))
GROUP BY p.Tier
Thanks for any assistance, and I apologize for any confusion. This is my first time here and I'm trying to correct my etiquette on the fly.
If you don't want any of the fields from that table dbo.Enrollment_Status_Change, and you don't seem to use it in any way — why even include it in the JOINs? Just leave it out.
Plus: start using table aliases. This is very hard to read if you use the full table name in each JOIN condition and WHERE clause.
Your code should be:
SELECT
COUNT(e.Customer_ID) AS Total, p.Tier
FROM
dbo.Phone_Tier p
INNER JOIN
dbo.Enrollments e ON p.Phone_Model = e.Phone_Model
WHERE
e.Active_Status = 1
AND EXISTS (SELECT DISTINCT Customer_ID
FROM dbo.Enrollment_Status_Change AS Status
WHERE (Change_Date <= '12/31/2010'))
GROUP BY
p.Tier
Also: most likely, your EXISTS check is wrong — since you didn't post your table structures, I can only guess — but my guess would be:
AND EXISTS (SELECT * FROM dbo.Enrollment_Status_Change
WHERE Change_Date <= '12/31/2010' AND Customer_ID = e.Customer_ID)
Check for existence of any entries in dbo.Enrollment_Status_Change for the customer defined by e.Customer_ID, with a Change_Date on or before that cut-off date. Right?
Assuming you want to:
exclude all customers whose latest enrollment_status_change record was since the start of 2011
but
include all customers whose latest enrollment_status_change record was earlier than the end of 2010 (why else would you have put that EXISTS clause in?)
Then this should do it:
SELECT COUNT(e.Customer_ID) AS Total,
p.Tier
FROM dbo.Phone_Tier p
JOIN dbo.Enrollments e ON p.Phone_Model = e.Phone_Model
WHERE e.Active_Status = 1
AND e.Customer_ID NOT IN (
SELECT Customer_ID
FROM dbo.Enrollment_Status_Change status
WHERE (Change_Date >= '2011-01-01')
)
GROUP BY p.Tier
Basically, the problem with your code is that joining a one-to-many table will always increase the row count. If you wanted to exclude all the records that had a matching row in the other table this would be fine -- you could just use a LEFT JOIN and then set a WHERE clause like Customer_ID IS NULL.
But because you want to exclude a subset of the enrollment_status_change table, you must use a subquery.
Your intention is not clear from the example given, but if you wanted to exclude anyone whose enrollment_status_change was before 2011 and include those whose status change was since 2011, you'd just swap the date comparator for <.
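To make the difference between joining and using a subquery concrete, here is a minimal sketch run against an in-memory SQLite database from Python. The table and column names follow the thread, but the table structures and sample rows are guesses modelled on the description in the question (4 enrollments, 3 active, one Low customer with status changes on 2010-12-30 and 2011-01-12). Dates are stored as ISO text so that string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE phone_tier (phone_model TEXT, tier TEXT);
    CREATE TABLE enrollments (
        customer_id INTEGER, phone_model TEXT, active_status INTEGER);
    CREATE TABLE enrollment_status_change (
        customer_id INTEGER, change_date TEXT);

    INSERT INTO phone_tier VALUES ('M1', 'High'), ('M2', 'Low');
    INSERT INTO enrollments VALUES
        (1, 'M1', 1), (2, 'M2', 1), (3, 'M2', 1), (4, 'M2', 0);
    INSERT INTO enrollment_status_change VALUES
        (3, '2010-12-30'), (3, '2011-01-12'), (4, '2010-11-01');
    """
)

# Joining the one-to-many table repeats customer 3 once per status change.
joined_rows_for_customer_3 = conn.execute(
    """
    SELECT COUNT(*)
    FROM enrollments AS e
    JOIN enrollment_status_change AS s ON s.customer_id = e.customer_id
    WHERE e.customer_id = 3
    """
).fetchone()[0]

# The subquery only decides which customers to drop, so no rows multiply.
tier_totals = conn.execute(
    """
    SELECT COUNT(e.customer_id) AS total, p.tier
    FROM phone_tier AS p
    JOIN enrollments AS e ON p.phone_model = e.phone_model
    WHERE e.active_status = 1
      AND e.customer_id NOT IN (
          SELECT customer_id
          FROM enrollment_status_change
          WHERE change_date >= '2011-01-01'
      )
    GROUP BY p.tier
    ORDER BY p.tier
    """
).fetchall()

print(joined_rows_for_customer_3, tier_totals)
```

One caveat with NOT IN: if the subquery can return a NULL customer_id, the condition never evaluates to true and no rows come back, in which case NOT EXISTS with a correlated subquery is the safer form.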
Is this any help?