identifying a calculated column and reading value from it - sql

I have a simple table containing data for a batch of students and
their scores for different years (the data may not be realistic, but it's
just an example).
Name   Dept     HOD       Year1   Year2  Year3  Year4
Sam    Science  Christie  76.23   34.65  45.67  23.45
Mike   Science  Christie  57.987  26.98  43.98  78.34
Bonny  Maths    Christie  64.87   67.23  34.09  12.87
Ben    English  Simon     43.98   54.76  55.87  76.87
Now the requirement is: taking 2015 as the point of reference, if the
user enters 2018 and wants data for Sam in the Science department under
Christie's management, then the value from column Year3 (i.e. 2018 - 2015 = 3)
is expected for all those conditions.
For example -
[case
when Name='Sam' and Dept='Science' and HOD='Christie' then Year*
when Name='Bonny' and Dept='Maths' and HOD='Christie' then Year*
when Name='Ben' and Dept='English' and HOD='Simon' then Year*
end]
I have already tried the sql -
select Value from (
select concat('Year', abs(2018-2015)) as Value from Class where
Name='Sam' and Dept='Science' and HOD='Christie')
When I hardcode Year3 in the above query instead of the formula,
it works fine. When I run the computed Value expression on its own, it also
returns output:
select concat('Year', abs(2018-2015)) as Value from Class
But when I combine the two, my query only returns the string 'Year3',
whereas I want to pick the value stored in that column.
Maybe I am doing something wrong, I am not sure, but any suggestion to
solve this problem is welcome.
I came across a post advising the use of coalesce() for dynamically
referencing a column, so I tried that too -
select coalesce(Value, T.period,0) as Value from (
select concat('Year',(abs(2018-2015)))as period, Name, Dept, HOD from
Class where Name='Sam' and Dept='Science' and HOD='Christie') as T
where T.Name='Sam' and T.Dept='Science' and T.HOD='Christie'
But I am receiving error -
Error while compiling statement: FAILED: SemanticException [Error 10004]:
line 1:16 Invalid table alias or column reference 'Value': (possible
column names are: period, Name, Dept, HOD)

Is this what you want?
select (case when @year = 2016 then year1
when @year = 2017 then year2
when @year = 2018 then year3
when @year = 2019 then year4
end) as value
from class c
where hod = 'Christie'
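A minimal sketch of this CASE approach, run against SQLite through Python's sqlite3 module rather than Hive (an assumption made only so the example is self-contained and runnable). The helper `score_for` and the constant `BASE_YEAR` are hypothetical names introduced for illustration, and the offset is computed as year minus 2015 as described in the question. The column name cannot be built as a string at run time, so each possible offset is mapped to its column explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Class (Name TEXT, Dept TEXT, HOD TEXT, "
    "Year1 REAL, Year2 REAL, Year3 REAL, Year4 REAL)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO Class VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("Sam", "Science", "Christie", 76.23, 34.65, 45.67, 23.45),
        ("Mike", "Science", "Christie", 57.987, 26.98, 43.98, 78.34),
        ("Bonny", "Maths", "Christie", 64.87, 67.23, 34.09, 12.87),
        ("Ben", "English", "Simon", 43.98, 54.76, 55.87, 76.87),
    ],
)

BASE_YEAR = 2015  # point of reference from the question


def score_for(name, dept, hod, year):
    """Return the YearN value where N = year - BASE_YEAR, or None."""
    offset = year - BASE_YEAR
    row = conn.execute(
        """
        SELECT CASE ?
                 WHEN 1 THEN Year1
                 WHEN 2 THEN Year2
                 WHEN 3 THEN Year3
                 WHEN 4 THEN Year4
               END
        FROM Class
        WHERE Name = ? AND Dept = ? AND HOD = ?
        """,
        (offset, name, dept, hod),
    ).fetchone()
    return row[0] if row else None


sam_2018 = score_for("Sam", "Science", "Christie", 2018)
print(sam_2018)
```

An offset outside 1 to 4 falls through the CASE and yields NULL (None in Python), which may or may not be the behaviour you want.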

Related

Trouble pivoting data in DB2

Before this one is marked as a duplicate, please know that I have done my research on pivoting in DB2 (even though DB2 doesn't have PIVOT) from these links:
Pivoting in DB2 on SO and IBM Developers. I just can't make sense of how to do it with my data and need some help. I tried to adapt my query using examples from both of those links and could not get it to work. I'm not asking anyone to write the full code for me, just to point me in the right direction on how to change my query to retrieve the desired result. Thank you in advance.
Current String:
SELECT
cfna1 AS "Customer Name",
cfrisk AS "Risk Rating",
cfrirc AS "Rated By",
date(digits(decimal(cfrid7 + 0.090000, 7, 0))) AS "Risk Rated Date",
cfuc3n3 AS "Credit Score",
date(digits(decimal(cf3ud7 + 0.090000, 7, 0))) AS "CR Date"
FROM cncttp08.jhadat842.cfmast cfmast
WHERE cfcif# IN ('T000714', 'T000713', 'T000716', 'T000715')
ORDER BY
CASE cfcif#
WHEN 'T000714' THEN 1
WHEN 'T000713' THEN 2
WHEN 'T000716' THEN 3
WHEN 'T000715' THEN 4
END
Result as expected from String:
Customer Name | Risk Rating | Rated By | Risk Rated Date | Credit Score | CR Date
Elmer Fudd 8 MLA 2018-02-08 777 2018-02-08
Result I would like to achieve:
Elmer Fudd
Risk Rating 8
Rated By MLA
Risk Rated Date 2018-02-08
Credit Score 777
CR Date 2018-02-08
Use the unpivot method suggested in the IBM Developers link, and use cast to convert all columns to varchar so that they can share a single value column.
Example:
select st1.id1, unpivot1.col1, unpivot1.val1
from (
select id1, char1 , date1, number1
from sometable
) st1,
lateral (values
('char col', cast(st1.char1 as varchar(100))),
('date col', cast(st1.date1 as varchar(100))),
('number col', cast(st1.number1 as varchar(100)))
) as unpivot1 (col1, val1)
order by st1.id1
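Where LATERAL is not available, the same unpivot idea can be sketched with UNION ALL, one branch per source column. The following is only an illustration run through Python's sqlite3 module rather than DB2. The table and column names follow the example above, the single data row borrows values from the question, and the `pos` column is an assumption added to keep the output rows in a fixed order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sometable (id1 INTEGER, char1 TEXT, date1 TEXT, number1 INTEGER)"
)
# One illustrative row; values borrowed from the question's sample output.
conn.execute("INSERT INTO sometable VALUES (1, 'MLA', '2018-02-08', 777)")

# Each branch turns one column into a (label, value) row.
# Every value is cast to text so all branches share one column type.
rows = conn.execute(
    """
    SELECT id1, col1, val1
    FROM (
        SELECT id1, 1 AS pos, 'char col' AS col1, CAST(char1 AS TEXT) AS val1
        FROM sometable
        UNION ALL
        SELECT id1, 2, 'date col', CAST(date1 AS TEXT)
        FROM sometable
        UNION ALL
        SELECT id1, 3, 'number col', CAST(number1 AS TEXT)
        FROM sometable
    )
    ORDER BY id1, pos
    """
).fetchall()

for row in rows:
    print(row)
```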
I don't think that output is possible in sql -- do you mean something like this?
id_group Data_Type Value
1 Name Elmer Fudd
1 Risk Rating 8
1 Rated By MLA
1 Risk Rated Date 2018-02-08
1 Credit Score 777
1 CR Date 2018-02-08
To do this we need another column that ties all the elements together -- I called it "id_group"; this is the column that identifies the group.

finding duplicate rows with different IDs based on multiple columns

Please forgive me if my jargon is off. I'm still learning!
I just started using Teradata, and to be honest it has been a lot of fun. However, I have hit a roadblock that has stumped me for a while.
I successfully selected a table from a database that looks like:
ID service date name
1 service1 1/5/15 john
2 service2 1/7/15 steve
3 service3 1/8/15 lola
4 service4 1/3/15 joan
5 service5 1/5/15 fred
6 service3 1/3/15 joan
7 service5 1/8/15 oscar
Now I want to search the database again to find any duplicate rows (for example, to see if service1 with date 1/5/15 and name john exists on another row with a different ID).
At first, I did something like this:
SELECT ID, service, date, name
FROM table
WHERE table.service = ANY('service1', 'service2', 'service3', 'service4', 'service5', 'service3', 'service5')
AND table.date = ANY('1/5/15', '1/7/15', '1/8/15', '1/3/15', '1/5/15', '1/3/15', '1/8/15')
AND table.name = ANY('john', 'steve', 'lola', 'joan', 'fred', 'joan', 'oscar');
But this is giving me more rows than I wanted.
example:
ID service date name
92 service3 1/8/15 steve
is of no use to me, since I am looking for IDs that have the same combination of service, date, and name as any of the IDs in the above table.
Something like this would be preferable:
ID service date name
609 service3 1/8/15 lola
since it matches that of ID 3.
I was curious to see if it were possible to treat the three columns (service, date, name) as a vector and maybe select the rows that match it that way?
ex
......
WHERE (table.service, table.date, table.name) = ANY((service3,1/8/15,lola), (service1, 1/5/15, john), ...etc)
My Teradata is down right now, so I have yet to try the above example. Nevertheless, any thoughts or feedback are greatly appreciated!
The following query may be what you are trying to achieve. This selects IDs for which the combination of service, date, and name appears more than once.
SELECT t1.ID
FROM yourTable t1
INNER JOIN
(
SELECT service, date, name
FROM yourTable
GROUP BY service, date, name
HAVING COUNT(*) > 1
) t2
ON t1.service = t2.service AND
t1.date = t2.date AND
t1.name = t2.name
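As a quick check of the join above, here is a minimal sketch run on SQLite through Python's sqlite3 module rather than Teradata (an assumption made so the example runs anywhere). The rows are a subset of the question's sample plus the two extra rows it mentions (IDs 92 and 609). Only the rows that share a full service, date, and name combination should come back.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (ID INTEGER, service TEXT, date TEXT, name TEXT)"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        (1, "service1", "1/5/15", "john"),
        (2, "service2", "1/7/15", "steve"),
        (3, "service3", "1/8/15", "lola"),
        (4, "service4", "1/3/15", "joan"),
        (6, "service3", "1/3/15", "joan"),
        (92, "service3", "1/8/15", "steve"),   # partial match only
        (609, "service3", "1/8/15", "lola"),   # full duplicate of ID 3
    ],
)

# The derived table keeps combinations that occur more than once;
# the join then recovers the IDs carrying those combinations.
dupe_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT t1.ID
        FROM yourTable t1
        INNER JOIN (
            SELECT service, date, name
            FROM yourTable
            GROUP BY service, date, name
            HAVING COUNT(*) > 1
        ) t2
          ON t1.service = t2.service
         AND t1.date = t2.date
         AND t1.name = t2.name
        ORDER BY t1.ID
        """
    )
]
print(dupe_ids)
```

ID 92 is excluded because it matches on service and date but not on name.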
This is a simple task for a Windowed Aggregate:
SELECT *
FROM tab
QUALIFY
COUNT(*) OVER (PARTITION BY service, date, name) > 1
This counts the number of rows with the same combination of values (like Tim Biegeleisen's Derived Table) but unlike a Standard Aggregate it keeps all rows. The QUALIFY is a nice Teradata syntax extension to avoid a Derived Table.
Don't hardcode values in your query unless you absolutely have to. Instead, take the query you already wrote and join to that.
SELECT dupes.*
FROM (your query) yourquery
JOIN table dupes
ON yourquery.service = dupes.service
AND yourquery.date = dupes.date
AND yourquery.name = dupes.name

Parsing inconsistent data with SQL

I need to write SQL to extract repeat location codes and separate out the sub-location detail. However, the data I am working with does not follow a set pattern.
Here's a sample of what the location codes look like (the real table has over 5,000 locations):
JR-DY-TIN
DY-RHOLD
DY-PREQ-TIN
GLVCSH
GLFLR
GLBOX1
GLBOX2
GLBOX3
GLBOXA
GLBOXB
GLBOXC
GLBOXD
GL
GL0001
GL0002
GL0003
GL0014
…
I was able to create a new column for the sub-location detail when it is numeric but that's all I have so far.
select
LocationCode,
REVERSE(LEFT(REVERSE(LocationCode),PATINDEX('%[A-Za-z]%',
REVERSE(LocationCode))-1)) AS PaddedNumbers
from LocationTable
Results...
LocationCode PaddedNumbers
------------ -------------
JR-DY-TIN
DY-RHOLD
DY-PREQ-TIN
GLVCSH
GLFLR
GLBOX1 1
GLBOX2 2
GLBOX3 3
GLBOXA
GLBOXB
GLBOXC
GLBOXD
GL
GL0001 0001
GL0002 0002
GL0003 0003
GL0014 0014
I still need to figure out how to display the following in two separate columns:
Location codes without the sub-locations detail, e.g. GLBOX , or just
the original location code if there is no sub-location, e.g. GLFLR.
Numeric and Nonnumeric sub-location detail at the same time, e.g. for
GLBOX have a column that displays 1, 2, 3,A, B, C, D, E, F.
Edit: If I am able to accomplish this the data should look like this:
LocationCode MainLoc SubLoc
------------ --------- ------
JR-DY-TIN JR-DY-TIN
DY-RHOLD DY-RHOLD
DY-PREQ-TIN DY-PREQ-TIN
GLVCSH GLVCSH
GLFLR GLFLR
GLBOX1 GLBOX 1
GLBOX2 GLBOX 2
GLBOX3 GLBOX 3
GLBOXA GLBOX A
GLBOXB GLBOX B
GLBOXC GLBOX C
GLBOXD GLBOX D
GL GL
GL0001 GL 0001
GL0002 GL 0002
GL0003 GL 0003
GL0014 GL 0014
Any help is appreciated.
Environment: SQL Server 2008 R2.
It seems like you want to use something like a parseInt feature, which is not available in SQL Server 2008. You can attempt to use cast, but that won't work with your datatype - varchar.
I'd suggest using a case expression to handle the complex logic you need, e.g.:
select
LocationCode,
case when left(LocationCode,5) like 'GLBOX%' then substring(LocationCode,5,2)
when left(LocationCode,3) like 'GL0%' then substring(LocationCode,3,4)
else 'null' end as ParsedLocationCode end
from LocationTable
John's answer seems basically correct. I would write it as:
select LocationCode,
(case when LocationCode like 'GLBOX%' then right(LocationCode, 1)
when LocationCode like 'GL0%' then right(LocationCode, 4)
end) as ParsedLocationCode
from LocationTable
This changes:
Removes the unnecessary left() before like.
Fixed a syntax error (probably a typo with an extra end).
Uses right(), just because it seems simpler.
-- Ref is the 1-based position where the sub-location starts (0 = no sub-location)
DECLARE @LocationRef TABLE (Location NVARCHAR(20), Ref INT)
INSERT INTO @LocationRef VALUES
('JR-DY-TIN',0)
,('DY-RHOLD',0)
,('DY-PREQ-TIN',0)
,('GLVCSH',0)
,('GLFLR',0)
,('GLBOX1',6)
,('GLBOX2',6)
,('GLBOX3',6)
,('GLBOXA',6)
,('GLBOXB',6)
,('GLBOXC',6)
,('GLBOXD',6)
,('GL',0)
,('GL0001',3)
,('GL0002',3)
,('GL0003',3)
,('GL0014',3)
SELECT Location AS LocationCode
,LEFT(Location,CASE Ref WHEN 0 THEN LEN(Location) ELSE Ref - 1 END) AS MainLoc
,RIGHT(Location,CASE Ref WHEN 0 THEN 0 ELSE LEN(Location) - Ref + 1 END) AS SubLoc
FROM @LocationRef
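To make the splitting rule concrete outside the database, here is a small Python sketch of one possible rule. It is not taken from any of the answers. It assumes that trailing digits are always a sub-location, and that lettered sub-locations can only be recognised from a list of known main locations. `KNOWN_PREFIXES` and `split_location` are hypothetical names, and the prefix list would need to be filled in from the real data.

```python
import re

# Hypothetical list of main locations that have lettered sub-locations.
KNOWN_PREFIXES = ("GLBOX",)


def split_location(code):
    """Return (main, sub); sub is '' when there is no sub-location."""
    # Trailing digits that follow a non-digit are treated as the sub-location.
    match = re.fullmatch(r"(.*?\D)(\d+)", code)
    if match:
        return match.group(1), match.group(2)
    # Otherwise, split only when the code starts with a known main location.
    for prefix in KNOWN_PREFIXES:
        if code.startswith(prefix) and len(code) > len(prefix):
            return prefix, code[len(prefix):]
    return code, ""


for sample in ("JR-DY-TIN", "GLFLR", "GLBOX1", "GLBOXA", "GL", "GL0014"):
    print(sample, split_location(sample))
```

A code such as GLFLR stays whole because it ends in a letter and matches no known prefix. That is why the letter case cannot be solved by pattern matching alone.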

How to load grouped data with SSIS

I have a tricky flat file data source. The data is grouped, like this:
Country  City
U.S.     New York
         Washington
         Baltimore
Canada   Toronto
         Vancouver
But I want it to be this format when it's loaded in to the database:
Country City
U.S. New York
U.S. Washington
U.S. Baltimore
Canada Toronto
Canada Vancouver
Has anyone met such a problem before? Any ideas on how to deal with it?
The only idea I have now is to use a cursor, but it is just too slow.
Thank you!
The answer by cha will work, but here is another in case you need to do it in SSIS without temporary/staging tables:
You can run your dataflow through a Script Transformation that uses a DataFlow-level variable. As each row comes in the script checks the value of the Country column.
If it has a non-blank value, then populate the variable with that value, and pass it along in the dataflow.
If Country has a blank value, then overwrite it with the value of the variable, which will be last non-blank Country value you got.
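The row-by-row logic described above can be sketched outside SSIS as follows. This is plain Python rather than a Script Component, and `fill_down` is a hypothetical helper name. It only illustrates the "remember the last non-blank Country" idea.

```python
def fill_down(rows):
    """Yield (country, city) pairs with blank countries filled in."""
    last_country = None  # plays the role of the variable kept between rows
    for country, city in rows:
        if country and country.strip():
            # Non-blank value: remember it for the rows that follow.
            last_country = country.strip()
        # Blank value: reuse the last non-blank country seen so far.
        yield last_country, city


grouped = [
    ("U.S.", "New York"),
    ("", "Washington"),
    ("", "Baltimore"),
    ("Canada", "Toronto"),
    ("", "Vancouver"),
]
filled = list(fill_down(grouped))
for pair in filled:
    print(pair)
```

This works only because the rows arrive in file order, so anything that reorders rows before this step would break it.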
EDIT: I looked up your error message and learned something new about Script Components (the Data Flow tool, as opposed to Script Tasks, the Control Flow tool):
The collection of ReadWriteVariables is only available in the
PostExecute method to maximize performance and minimize the risk of
locking conflicts. Therefore you cannot directly increment the value
of a package variable as you process each row of data. Increment the
value of a local variable instead, and set the value of the package
variable to the value of the local variable in the PostExecute method
after all data has been processed. You can also use the
VariableDispenser property to work around this limitation, as
described later in this topic. However, writing directly to a package
variable as each row is processed will negatively impact performance
and increase the risk of locking conflicts.
That comes from this MSDN article, which also has more information about the VariableDispenser workaround, if you want to go that route. Apparently I misled you above when I said you can set the value of the package variable in the script. You have to use a variable that is local to the script, and then copy it to the package variable in the PostExecute method. I can't tell from the article whether that means you will not be able to read the variable in the script; if that's the case, then the VariableDispenser would be the only option. Or I suppose you could create another variable that the script has read-only access to, and set its value to an expression so that it always has the value of the read-write variable. That might work.
Yes, it is possible. First you need to load the data to a table with an IDENTITY column:
-- drop table #t
CREATE TABLE #t (id INTEGER IDENTITY PRIMARY KEY,
Country VARCHAR(20),
City VARCHAR(20))
INSERT INTO #t(Country, City)
SELECT a.Country, a.City
FROM OPENROWSET( BULK 'c:\import.txt',
FORMATFILE = 'c:\format.fmt',
FIRSTROW = 2) AS a;
select * from #t
The result will be:
id          Country              City
----------- -------------------- --------------------
1           U.S.                 New York
2                                Washington
3                                Baltimore
4           Canada               Toronto
5                                Vancouver
And now with a bit of recursive CTE magic you can populate the missing details:
;WITH a as(
SELECT Country
,City
,ID
FROM #t WHERE ID = 1
UNION ALL
SELECT COALESCE(NULLIF(LTrim(#t.Country), ''),a.Country)
,#t.City
,#t.ID
FROM a INNER JOIN #t ON a.ID+1 = #t.ID
)
SELECT * FROM a
OPTION (MAXRECURSION 0)
Result:
Country City ID
-------------------- -------------------- -----------
U.S. New York 1
U.S. Washington 2
U.S. Baltimore 3
Canada Toronto 4
Canada Vancouver 5
Update:
As Tab Alleman suggested below the same result can be achieved without the recursive query:
SELECT ID
, COALESCE(NULLIF(LTrim(a.Country), ''), (SELECT TOP 1 Country FROM #t t WHERE t.ID < a.ID AND LTrim(t.Country) <> '' ORDER BY t.ID DESC))
, City
FROM #t a
BTW, the format file for your input data is this (if you want to try the scripts save the input data as c:\import.txt and the format file below as c:\format.fmt):
9.0
2
1 SQLCHAR 0 11 "" 1 Country SQL_Latin1_General_CP1_CI_AS
2 SQLCHAR 0 100 "\r\n" 2 City SQL_Latin1_General_CP1_CI_AS

statistic syntax in access

I want to compute some statistics for the points in my application. These are the columns of the Point table:
id type city
1 food NewYork
2 food Washington
3 sport NewYork
4 food .....
Each point belongs to a certain type and is located in a certain city.
Now I want to calculate the number of points in each city for each type.
For example, there are two types here :food and sport.
Then I want to know:
how many points of `food` and `sport` at NewYork
how many points of `food` and `sport` at Washington
how many points of `food` and `sport` at Chicago
......
I have tried this:
select type,count(*) as num from point group by type ;
But I cannot get it to group by city as well.
How can I do that?
Update
id type city
1 food NewYork
2 sport NewYork
3 food Chicago
4 food San
And I want to get something like this:
       NewYork  Chicago  San
food   2        1        1
sport  1        0        0
I will use an HTML table and a chart to display this data.
So I need to do the counting. I can use something like this:
select count(*) from point where type='food' and city ='San'
select count(*) from point where type='food' and city ='NewYork'
....
However, I think this is a bad idea, so I wonder if I can do the counting in a single SQL query.
BTW, for table data like this, how do people organize the structure using JSON?
This is what you want:
SELECT city,
COUNT(CASE WHEN [type] = 'food' THEN 1 END) AS FoodCount,
COUNT(CASE WHEN [type] = 'sport' THEN 1 END) AS SportCount
FROM point
GROUP BY city
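Here is a minimal, runnable sketch of the conditional-count query, using SQLite through Python's sqlite3 module instead of Access (an assumption made for portability). The rows are the four from the question's Update. COUNT ignores NULL, so a CASE with no ELSE counts only the matching rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE point (id INTEGER PRIMARY KEY, type TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO point VALUES (?, ?, ?)",
    [
        (1, "food", "NewYork"),
        (2, "sport", "NewYork"),
        (3, "food", "Chicago"),
        (4, "food", "San"),
    ],
)

# One output row per city, one counted column per type.
counts = conn.execute(
    """
    SELECT city,
           COUNT(CASE WHEN type = 'food' THEN 1 END) AS FoodCount,
           COUNT(CASE WHEN type = 'sport' THEN 1 END) AS SportCount
    FROM point
    GROUP BY city
    ORDER BY city
    """
).fetchall()

for row in counts:
    print(row)
```

The trade-off is that every type has to be listed by hand in the SELECT list.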
UPDATE:
To get the results in an aggregated row/column format you need to use a pivot table. In Access it's called a Crosstab query. You can use the Crosstab query wizard to generate the query via a nice UI or cut straight to the SQL:
TRANSFORM COUNT(id) AS CountOfId
SELECT type
FROM point
GROUP BY type
PIVOT city
The grouping is used to count the number of Id's for each type. The additional PIVOT clause groups the data by city and displays each grouping in a separate column. The end result looks something like this:
       NewYork  Chicago  San
food   2        1        1
sport  1        0        0