I have a Sybase table (which I can't alter) that I am trying to get into a specific table format. The table contains three columns all which are string values, with an id (which is not unique), a "position" which is a number that represents a field name, and a field column that is the value. The table looks like:
id position field
100 0 John
100 1 Jane
100 2 25
100 3 50
101 0 Dave
101 3 30
Position 0 means "SalesRep1", Position 1 means "SR1Commission", Position 2 means "SalesRep2", and Position 3 means "SR2Commission".
I am trying to get a table that looks like following, with the Commission columns being decimals instead of strings:
id SalesRep1 SR1Commission SalesRep2 SR2Commisson
100 John 25 Jane 50
101 Dave 30 NULL NULL
I've gotten close using CASE, but I end up with only one value per row and not sure there's a way to do what I want. I also have problems with trying to get CAST included to change the commission values from strings to decimals. Here's what I have so far:
SELECT id
CASE "position" WHEN '0' THEN field END AS SalesRep1,
CASE "position" WHEN '1' THEN field END AS SalesRep2,
CASE "position" WHEN '2' THEN field END AS SR1Commission,
CASE "position" WHEN '3' THEN field END AS SR2Commission
FROM v_custom_field WHERE id = ?
This gives me the following result when querying for id 100:
id SalesRep1 SR1Commission SalesRep2 SR2Commission
100 John NULL NULL NULL
100 NULL 25 NULL NULL
100 NULL NULL Jane NULL
100 NULL NULL NULL 50
This is close, but I want to 'collapse' the rows down into one row based off of the id as well as cast the commission values to numbers. I tried adding in a CAST(field AS DECIMAL) I'm not sure if this is even the right direction to go, and was looking into PIVOT, but Sybase doesn't seem to support that.
This is known as an entity-attribute-value table. They're a pain to work with because they're one step removed from being relational data, but they're very common for user-defined fields in applications.
If you can't use PIVOT, you'll need to do something like this:
SELECT DISTINCT s.id,
f0.field AS SalesRep1,
CAST(f1.field AS DECIMAL(20,5)) AS SR1Commission,
f2.field AS SalesRep2,
CAST(f3.field AS DECIMAL(20,5)) AS SR2Commission
FROM UnnamedSalesTable s
LEFT JOIN UnnamedSalesTable f0
ON f0.id = s.id AND f0.position = 0
LEFT JOIN UnnamedSalesTable f1
ON f1.id = s.id AND f1.position = 1
LEFT JOIN UnnamedSalesTable f2
ON f2.id = s.id AND f2.position = 2
LEFT JOIN UnnamedSalesTable f3
ON f3.id = s.id AND f3.position = 3
It's not very fast because it's a ton of self-joins followed by a DISTINCT, but it does work.
Related
I have a wide table that looks like this:
Case REFERENCE
OUTCOME_EMP_SITUATION
MONTH1_EMP_SITUATION
MONTH1_REASON
MONTH3_EMP_SITUATION
MONTH3_REASON
MONTH6_EMP_SITUATION
MONTH6_REASON
12345
Employed
Employed
Outcome at 1 month
Employed
Outcome at 3 month
Employed
Outcome at 6 month
this is survey results that people completed after they finished employment program. They complete the survey 4 times, once immediately after finishing the program, and then after 1/3/6 month. the problem is, the results for immediately after program completion are in one table (Outcome table) and the 1/3/6 month checkpoint results are in another table (Checkpointinfo table) I would like to combine those tables to create a long table so that instead of having "Outcome" in 5 different columns, I would have it in one column and it would look like this:
Case Reference
Outcome_emp_situation
Month_Reason
12345
Employed
NULL
12345
Employed
Outcome at 1 month
12345
Employed
Outcome at 3 month
12345
Employed
Outcome at 6 month
I was wondering if anyone could please help me out to turn this wide query into a long table query.
Here is the query for the wide table:
Select
ch.CASEREFERENCE, oc.OUTCOME_DATE, oc.OUTCOME_REFERENCE_ID, oc.OUTCOME_EMP_SITUATION, oc.OUTCOME_EMPLOYMENT_TYPE, oc.OUTCOME_NUM_JOBS, oc.OUTCOME_NAICS_DESC, oc.OUTCOME_JOB_NATURE,
oc.OUTCOME_WORK_HOURS, oc.OUTCOME_WAGE, oc.OUTCOME_STUDENT_STATUS, oc.OUTCOME_GOT_SERVICE, oc.OUTCOME_RIGHT_SERVICE, oc.OUTCOME_RECOMMEND_PROGRAM,
ck1.REASONCODE AS REASONCODE1,
CASE WHEN ck1.REASONCODE = 'OT1' THEN "Outcome at 1 month" END MONTH1_REASON,
ck1.MONTH_START_DATE AS MONTH1_START_DATE, ck1.MONTH_END_DATE AS MONTH1_END_DATE, ck1.MONTH_OUTCOME_EMP_SITUATION AS MONTH1_OUTCOME_EMP_SITUATION,
ck1.MONTH_EMPLOYMENT_TYPE AS MONTH1_EMPLOYMENT_TYPE, ck1.MONTH_NUM_JOBS AS ,MONTH1_NUM_JOBS, ck1.MONTH_NAICS_DESC AS MONTH1_NAICS_DESC, ck1.MONTH_JOB_NATURE AS MONTH1_JOB_NATURE,
ck1.MONTH_WORK_HOURS AS MONTH1_WORK_HOURS, ck1.MONTH_WAGE AS MONTH1_WAGE, ck1.MONTH_STUDENT_STATUS AS MONTH1_STUDENT_STATUS, ck1.MONTH_GOT_SERVICE AS MONTH1_GOT_SERVICE,
ck1.MONTH_RIGHT_SERVICE AS MONTH1_RIGHT_SERVICE, ck1.MONTH_RECOMMEND_PROGRAM AS MONTH1_RECOMMEND_PROGRAM, ck1.MONTH_RESUBMIT_MILESTONE AS MONTH1_RESUBMIT_MILESTONE,
ck1.MONTH_MILESTONE_ACHIEVED AS MONTH1_MILESTONE_ACHIEVED, ck1.MONTH_APPROVED_DATE AS MONTH1_APPROVED_DATE,
ck3.REASONCODE AS REASONCODE3,
CASE WHEN ck3.REASONCODE = 'OT3' THEN "Outcome at 3 month" END MONTH3_REASON,
ck3.MONTH_START_DATE AS MONTH3_START_DATE, ck3.MONTH_END_DATE AS MONTH3_END_DATE, ck3.MONTH_OUTCOME_EMP_SITUATION AS MONTH3_OUTCOME_EMP_SITUATION,
ck3.MONTH_EMPLOYMENT_TYPE AS MONTH3_EMPLOYMENT_TYPE, ck3.MONTH_NUM_JOBS AS ,MONTH3_NUM_JOBS, ck3.MONTH_NAICS_DESC AS MONTH3_NAICS_DESC, ck3.MONTH_JOB_NATURE AS MONTH3_JOB_NATURE,
ck3.MONTH_WORK_HOURS AS MONTH3_WORK_HOURS, ck3.MONTH_WAGE AS MONTH3_WAGE, ck3.MONTH_STUDENT_STATUS AS MONTH3_STUDENT_STATUS, ck3.MONTH_GOT_SERVICE AS MONTH3_GOT_SERVICE,
ck3.MONTH_RIGHT_SERVICE AS MONTH3_RIGHT_SERVICE, ck3.MONTH_RECOMMEND_PROGRAM AS MONTH3_RECOMMEND_PROGRAM, ck3.MONTH_RESUBMIT_MILESTONE AS MONTH3_RESUBMIT_MILESTONE,
ck3.MONTH_MILESTONE_ACHIEVED AS MONTH3_MILESTONE_ACHIEVED, ck3.MONTH_APPROVED_DATE AS MONTH3_APPROVED_DATE,
ck6.REASONCODE AS REASONCODE6,
CASE WHEN ck6.REASONCODE = 'OT6' THEN "Outcome at 6 month" END MONTH6_REASON,
ck6.MONTH_START_DATE AS MONTH6_START_DATE, ck6.MONTH_END_DATE AS MONTH6_END_DATE, ck6.MONTH_OUTCOME_EMP_SITUATION AS MONTH6_OUTCOME_EMP_SITUATION,
ck6.MONTH_EMPLOYMENT_TYPE AS MONTH6_EMPLOYMENT_TYPE, ck6.MONTH_NUM_JOBS AS ,MONTH6_NUM_JOBS, ck6.MONTH_NAICS_DESC AS MONTH6_NAICS_DESC, ck6.MONTH_JOB_NATURE AS MONTH6_JOB_NATURE,
ck6.MONTH_WORK_HOURS AS MONTH6_WORK_HOURS, ck6.MONTH_WAGE AS MONTH6_WAGE, ck6.MONTH_STUDENT_STATUS AS MONTH6_STUDENT_STATUS, ck6.MONTH_GOT_SERVICE AS MONTH6_GOT_SERVICE,
ck6.MONTH_RIGHT_SERVICE AS MONTH6_RIGHT_SERVICE, ck6.MONTH_RECOMMEND_PROGRAM AS MONTH6_RECOMMEND_PROGRAM, ck6.MONTH_RESUBMIT_MILESTONE AS MONTH6_RESUBMIT_MILESTONE,
ck6.MONTH_MILESTONE_ACHIEVED AS MONTH6_MILESTONE_ACHIEVED, ck6.MONTH_APPROVED_DATE AS MONTH6_APPROVED_DATE
FROM PROGRAM as pg
LEFT JOIN CASEINFO as ch ON pg.CASEID = ch.CASEID
LEFT JOIN OUTCOME as oc ON pg.CASEID = oc.CASEID
LEFT JOIN ( SELECT cp.CASEID, cp.REASONCODE, cp.MONTH_OUTCOME_EMP_SITUATION, cpi.* FROM CHECKPOINT cp LEFT JOIN CHECKPOINTINFO cpi ON cp.CASEREVIEWID = cpi.CASEREVIEWID WHERE cpi.REASONCODE = 'OT1')ck1 ON pg.CASEID = ck1.CASEID
LEFT JOIN ( SELECT cp.CASEID, cp.REASONCODE, cp.MONTH_OUTCOME_EMP_SITUATION, cpi.* FROM CHECKPOINT cp LEFT JOIN CHECKPOINTINFO cpi ON cp.CASEREVIEWID = cpi.CASEREVIEWID WHERE cpi.REASONCODE = 'OT3')ck3 ON pg.CASEID = ck3.CASEID
LEFT JOIN ( SELECT cp.CASEID, cp.REASONCODE, cp.MONTH_OUTCOME_EMP_SITUATION, cpi.* FROM CHECKPOINT cp LEFT JOIN CHECKPOINTINFO cpi ON cp.CASEREVIEWID = cpi.CASEREVIEWID WHERE cpi.REASONCODE = 'OT6')ck6 ON pg.CASEID = ck6.CASEID
If someone could please help me turn this wide table into a long table, it would be much appreciated.
thank you
You need to do unpivot for outcome and reason columns. But first you need an extra column for overall reason. This is the query:
with a as (
select 12345 as case_reference,
'Employed' as OUTCOME_EMP_SITUATION,
'Employed' as MONTH1_EMP_SITUATION,
'Outcome at 1 month' as MONTH1_REASON,
'Employed' as MONTH3_EMP_SITUATION,
'Outcome at 3 month' as MONTH3_REASON,
'Employed' as MONTH6_EMP_SITUATION,
'Outcome at 6 month' as MONTH6_REASON
from dual
)
select
case_reference,
outcome_emp_situation,
month_reason
from (
select a.*,
cast(null as varchar2(1000)) as reason
from a
) a
unpivot(
(Outcome_emp_situation, Month_Reason)
for mon in (
(OUTCOME_EMP_SITUATION, reason) as 0,
(MONTH1_EMP_SITUATION, MONTH1_REASON) as 1,
(MONTH3_EMP_SITUATION, MONTH3_REASON) as 3,
(MONTH6_EMP_SITUATION, MONTH6_REASON) as 6
)
)
order by mon asc
CASE_REFERENCE | OUTCOME_EMP_SITUATION | MONTH_REASON
-------------: | :-------------------- | :-----------------
12345 | Employed | null
12345 | Employed | Outcome at 1 month
12345 | Employed | Outcome at 3 month
12345 | Employed | Outcome at 6 month
db<>fiddle here
UPD: The explanation below.
The tuple just after unpivot keyword is the result column names, column after for keyword identifies column group which produced that values. Tuples inside in define the columns' groups: for each group that columns' values will be passed to the corresponding (by position) columns of the result tuple and new row will be generated with the value of for column defined after as keyword.
So if you need more columns to be transferred to each row, you need to add new columns to the result tuple (after unpivot) and to each column group inside in. If for some reason you have not enough columns to pass for some groups, you can wrap your source query with outer select and add dummy (or constantly valued) columns for that groups.
Note:
Datatypes of each tuples should be the same (or convertible according to default datatype precedence). I.e. each tuple's member on the same position should have the same type, members at different positions may have different types.
You can reuse the same column in multiple groups and positions.
Yes, I know this seems simple:
SELECT DISTINCT(...)
Except, it apparently isn't
Here is my actual Query:
SELECT
DeclinationReasons.Reason,
EmployeeInformation.ID,
EmployeeInformation.Employee,
EmployeeInformation.Active,
CompletedTrainings.DecShotDate,
CompletedTrainings.DecShotLocation,
CompletedTrainings.DecReason,
CompletedTrainings.DecExplanation,
IIf([DecShotLocation]="MCS","Yes","No") AS YesMCS,
IIf([DecReason]=1,1,0) AS YesAllergy,
IIf([DecReason]=2,1,0) AS YesImmune,
IIf([DecReason]=3,1,0) AS YesAdverse,
IIf([DecReason]=4,1,0) AS YesMedical,
IIf([DecReason]=5,1,0) AS YesSpiritual,
IIf([DecReason]=6,1,0) AS YesOther,
IIf([DecReason]=7,1,0) AS YesAlready
FROM
EmployeeInformation
INNER JOIN (CompletedTrainings
LEFT JOIN DeclinationReasons ON CompletedTrainings.DecReason = DeclinationReasons.ReasonID)
ON EmployeeInformation.ID = CompletedTrainings.Employee
GROUP BY
DeclinationReasons.Reason,
EmployeeInformation.ID,
EmployeeInformation.Employee,
EmployeeInformation.Active,
CompletedTrainings.DecShotDate,
CompletedTrainings.DecShotLocation,
CompletedTrainings.DecReason,
CompletedTrainings.DecExplanation,
IIf([DecShotLocation]="MCS","Yes","No"),
IIf([DecReason]=1,1,0),
IIf([DecReason]=2,1,0),
IIf([DecReason]=3,1,0),
IIf([DecReason]=4,1,0),
IIf([DecReason]=5,1,0),
IIf([DecReason]=6,1,0),
IIf([DecReason]=7,1,0)
HAVING
((((EmployeeInformation.Active) Like -1)
AND ((CompletedTrainings.DecShotDate + 365 >= DATE())
OR (CompletedTrainings.DecShotDate IS NULL))));
This is Joining a few tables (obviously) in order to get a number of records. The problem is that if someone is duplicated on the table with a NULL in one of the date fields, and a date in another field, it pulls both the NULL and the DATE, or pulls multiple NULLS it might pull multiple dates but those are not present right at the moment.
I need the Nulls, they are actual data in this particular case, but if someone has a date and a NULL I need to pull only the newest record, I thought I could add MAX(RecordID) from the table, but that didn't change the results of the query either.
That code:
SELECT
DeclinationReasons.Reason,
EmployeeInformation.ID,
EmployeeInformation.Employee,
EmployeeInformation.Active,
MAX(CompletedTrainings.RecordID),
CompletedTrainings.DecShotDate
...
And it returned the same issue, Duplicated EmployeeInformation.ID with different DecShotDate values.
Currently it returns:
ID
Active
DecShotDate
etc. x a bunch
1
-1
date date
whatever goes
2
-1
in these
2
-1
date date
columns
These are being used in a report, that is to determine the total number of employees who fit the criteria of the report. The NULLs in DecShotDate are needed as they show people who did not refuse to get a flu vaccine in the current year, while the dates are people who did refuse.
Now I have come up with one simple solution, I could add a column to the CompletedTrainings Table that contains a date or other value, and add that to the HAVING statement. This might be the right solution as this is a yearly training questionnaire that employees have to fill out. But I am asking for advice before doing this.
Am I right in thinking I need to add a column to filter by so that older data isn't being pulled, or should I be able to do this by pulling recordID, and did I just bork that part of the query up?
Edited to add raw table views:
EmployeeInformation Table:
ID
Last
First
empID
Active
Termdate
DoH
Title
PT/FT/PD
PI
1
Doe
Jane
982
-1
date
Sr
PD
X
2
Roe
John
278
0
date
date
Jr
PD
X
3
Moe
Larry
1232
-1
date
Sr
FT
X
4
Zoe
Debbie
1424
-1
date
Sr
PT
X
DeclinationReasons Table:
ReasonID
Reason
1
Allergy
2
Already got it
3
Illness
CompletedTrainings Table:
RecordID
Employee
Training
...
DecShotdate
DecShotLocation
DecShotReason
DecExp
1
1
4
date
location
2
text
2
1
4
3
2
4
4
3
4
date
location
3
text
5
3
4
date
location
1
text
6
4
4
After some serious soul searching, I decided to use another column and filter by that.
In the end my query looks like this:
SELECT *
FROM (
(
SELECT RecordID, DecShotDate, DecShotLocation, DecReason, DecExplanation, Employee,
IIf([DecShotLocation]="MCS","Yes","No") AS YesMCS, IIf([DecReason]=1,1,0) AS YesAllergy,
IIf([DecReason]=2,1,0) AS YesImmune, IIf([DecReason]=3,1,0) AS YesAdverse,
IIf([DecReason]=4,1,0) AS YesMedical, IIf([DecReason]=5,1,0) AS YesSpiritual,
IIf([DecReason]=6,1,0) AS YesOther, IIf([DecReason]=7,1,0) AS YesAlready
FROM CompletedTrainings WHERE (CompletedDate > DATE() - 365 ) AND (Training = 69)) AS T1
LEFT JOIN
(
SELECT ID, Active FROM EmployeeInformation) AS T2 ON T1.Employee = T2.ID)
LEFT JOIN
(
SELECT Reason, ReasonID FROM DeclinationReasons) AS T3 ON T1.DecReason = T3.ReasonID;
This may not have been the best solution, but it did exactly what I needed. Which is to get the information by latest entry into the database.
Previously I had tried to use MAX(), DISTINCT(), etc. but always had a problem of multiple records being retrieved. In this case, I intentionally SELECT the most recent records first, then join them to the results of the next query, and so on. Until I have all the required data for my report.
I write this in hopes someone else finds it useful. Or even better if someone tells me why this is wrong, so as to improve my own skills.
I am trying in Athena to output only users which have some specific value in them but not in all of the rows
Suppose I have the table below.
I want all users which have value '100' in at least one of their rows but also having in other rows value different than 100.
user | value
A | 1
B | 2
A | 100
D | 3
A | 4
C | 3
C | 5
D | 100
So in this example I would want to get only users A and D because only them having 100 and none 100.
I tried maybe grouping by user and creating an array of values per user and then checking if array contains 100 but I don't manage doing it presto.
Also I thought about converting rows to columns and then checking if one of columns equals 100.
Those solutions are too complex? Anybody knows how to implement them or anyone has a better simpler solution?
The users that have at least one value of 100 can be found with this SQL:
SELECT DISTINCT user
FROM some_table
WHERE value = 100
But I assume you are after all tuples of user and value where the user has at least one value of 100, this can be accomplished by using the query above in a slightly more complex query:
WITH matching_users AS (
SELECT DISTINCT user
FROM some_table
WHERE value = 100
)
SELECT user, value
FROM matching_users
LEFT JOIN some_table USING (user)
You can use sub query as below to achieve your required output=
SELECT * FROM your_table
WHERE User IN(
SELECT DISTINCT User
FROM your_table
WHERE Value = 100
)
If you just want the users, I would go for aggregation:
select user
from t
group by user
having sum(case when value = 100 then 1 else 0 end) > 0;
If 100 is the maximum possible value, this can be simplified to:
having max(value) = 100
Say I have a table in an sql database like
name age shoesize
---------------------
tom 20 NULL
dick NULL 4
harry 30 5
and I want an SQL statement that selects names that have age == X, or as a fallback, if no such names exist, use those with shoe size == Y. In other words, in this table, for X=20,Y=4 I should only get 'tom', while for X=25,Y=4 I should get only 'dick'. I can't do that with
SELECT name FROM table WHERE age = 20 OR shoe size = 4;
because that will select both tom and dick. I'm currently using
SELECT COALESCE ((SELECT name FROM tab WHERE age = 20),(SELECT name FROM tab WHERE shoesize = 4));
but is there a neater way? Also using coalesce like this doesn't allow me to get the whole row - i.e. I can't use SELECT * FROM tab, I can only select a single name.
You can use ORDER BY and FETCH FIRST 1 ROW ONLY or some similar clause:
SELECT name
FROM tab
ORDER BY (CASE WHEN age = X THEN 1
WHEN shoesize = Y THEN 2
ELSE 3
END)
FETCH FIRST 1 ROW ONLY;
Some databases spell FETCH FIRST 1 ROW ONLY like LIMIT or TOP or even something else.
I wrote the following SQL to create a column that I can use to populate check boxes in a Grid to manage user permissions.
SELECT access_b2b.access_id,
access_b2b.description,
'active'= CASE
WHEN access_group.group_id IS NOT NULL THEN 1
ELSE 0
END
FROM access_b2b
LEFT JOIN access_group
ON access_group.access_id = access_b2b.access_id
WHERE ( access_group.group_id = 10
OR access_group.group_id IS NULL )
However, it does not select all of the entries from access_b2b. The issues is with the last line:
where (access_group.group_id=10 or access_group.group_id is null)
Without it, i get duplicate entries returned with different active values. Also, I realized that this is not the proper condition, because an entry in access_group might exist for a different access_group.group_id, meaning that not all the remaining entries will be pulled in with the access_group.group_id is null.
I am trying to write my condition so that if does something along the lines of:
This is the format I was trying to follow:
Where For Each unique access_id in access_group
select the one where group_id=10
if no group_id=10
select any other one
end
end
Ultimately, the goal is to have a column returned with 1 or 0 denoting if the access_id exists for a predetermined group id.
Please note that throughout this explanation I used group_id=10 for simplification, it will be later replaced with a SqlParameter.
Any help is appreciated, thank you so much!
SAMPLE DATA (only useful columns shown to simplify data)
access_group
access_id group_id
27 1
27 11
28 1
28 11
33 1
33 3
33 11
43 11
44 1
44 10
44 11
...
access_b2b
access_id description
1 Add
2 Edit
3 Delete
4 List
5 Payments
6 Open Files
7 Order
8 Mod
...
Change the query to and it should work:
SELECT access_b2b.access_id,
access_b2b.description,
'active'= CASE
WHEN access_group.group_id IS NOT NULL THEN 1
ELSE 0
END
FROM access_b2b
LEFT JOIN access_group
ON access_group.access_id = access_b2b.access_id
AND ( access_group.group_id = 10
OR access_group.group_id IS NULL )
If you don't want the records to be filtered by the WHERE clause, move the condition in the JOIN.
The JOIN will keep the lines and populate them with NULL if the condition is not met, while the WHERE clause will filter the result set.