select in sql server 2005 - sql

I have a table follow:
ID | first | end
--------------------
a | 1 | 3
b | 3 | 8
c | 8 | 10
I want to select follow:
ID | first | end
---------------------
a-c | 1 | 10
But i can't do it. Please! help me. Thanks!

This works for me:
SELECT MIN(t.id)+'-'+MAX(t.id) AS ID,
MIN(t.[first]) AS first,
MAX(t.[end]) AS [end]
FROM dbo.YOUR_TABLE t
But please, do not use reserved words like "end" for column names.

I believe you can do this using a recursive Common Table Expression as follows, especially if you're not expecting very long chains of records:
WITH Ancestors AS
(
SELECT
InitRow.[ID] AS [Ancestor],
InitRow.[ID],
InitRow.[first],
InitRow.[end],
0 AS [level],
'00000' + InitRow.[ID] AS [hacky_level_plus_ID]
FROM
YOUR_TABLE AS InitRow
WHERE
NOT EXISTS
(
SELECT * FROM YOUR_TABLE AS PrevRow
WHERE PrevRow.[end] = InitRow.[first]
)
UNION ALL
SELECT
ParentRow.Ancestor,
ChildRow.[ID],
ChildRow.[first],
ChildRow.[end],
ParentRow.level + 1 AS [level],
-- Avoids having to build the recursive structure more than once.
-- We know we will not be over 5 digits since CTEs have a recursion
-- limit of 32767.
RIGHT('00000' + CAST(ParentRow.level + 1 AS varchar(4)), 5)
+ ChildRow.[ID] AS [hacky_level_plus_ID]
FROM
Ancestors AS ParentRow
INNER JOIN YOUR_TABLE AS ChildRow
ON ChildRow.[first] = ParentRow.[end]
)
SELECT
Ancestors.Ancestor + '-' + SUBSTRING(MAX([hacky_level_plus_ID]),6,10) AS [IDs],
-- Without the [hacky_level_plus_ID] column, you need to do it this way:
-- Ancestors.Ancestor + '-' +
-- (SELECT TOP 1 Children.ID FROM Ancestors AS Children
-- WHERE Children.[Ancestor] = Ancestors.[Ancestor]
-- ORDER BY Children.[level] DESC) AS [IDs],
MIN(Ancestors.[first]) AS [first],
MAX(Ancestors.[end]) AS [end]
FROM
Ancestors
GROUP BY
Ancestors.Ancestor
-- If needed, add OPTION (MAXRECURSION 32767)
A quick explanation of what each part does:
The WITH Ancestors AS (...) clause creates a Common Table Expression (basically a subquery) with the name Ancestors. The first SELECT in that expression establishes a baseline: all the rows that have no matching entry prior to it.
Then, the second SELECT is where the recursion kicks in. Since it references Ancestors as part of the query, it uses the rows it has already added to the table and then performs a join with new ones from YOUR_TABLE. This will recursively find more and more rows to add to the end of each chain.
The last clause is the SELECT that uses this recursive table we've built up. It does a simple GROUP BY since we've saved off the original ID in the Ancestor column, so the start and end are a simple MIN and MAX.
The tricky part is figuring out the ID of the last row in the chain. There are two ways to do it, both illustrated in the query. You can either join back with the recursive table, in which case it will build the recursive table all over again, or you can attempt to keep track of the last item as you go. (If building the recursive list of chained records is expensive, you definitely want to minimize the number of times you need to do that.)
The way it keeps track as it goes is to keep track of its position in the chain (the level column -- notice how we add 1 each time we recurse), zero-pad it, and then stick the ID at the end. Then, getting the item with the max level is simply a MAX followed by stripping the level data out.
If the CTE has to recurse too much, it will generate an error, but I believe you can tweak that using the MAXRECURSION option. The default is 100. If you have to set it higher than that, you may want to consider not using a recursive CTE to do this.
This also doesn't handle malformed data very well. If you have two records with the same first or a record where first == end, then this won't work right and you may have to tweak the join conditions inside the CTE or go with another approach.
This isn't the only way to do it. I believe it would be easier to follow if you built a custom procedure and did all the steps manually. But this has the advantage of operating in a single statement.

Related

SQL Server 'AS' alias unexpected syntax

I've come across following T-SQL today:
select c from (select 1 union all select 1) as d(c)
that yields following result:
c
-----------
1
1
The part that got me confused was d(c)
While trying to understand what's going on I've modified T-SQL into:
select c, b from (select 1, 2 union all select 3, 4) m(c, b)
which yields following result:
c b
----------- -----------
1 2
3 4
It was clear that d & m are table reference while letters in brackets c & b are reference to columns.
I wasn't able to find relevant documentation on msdn, but curious if
You're aware of such syntax?
What would be useful use case scenario?
select c from (select 1 union all select 1) as d(c)
is the same as
select c from (select 1 as c union all select 1) as d
In the first query you did not name the column(s) in your subquery, but named them outside the subquery,
In the second query you name the column(s) inside the subquery
If you try it like this (without naming the column(s) in the subquery)
select c from (select 1 union all select 1) as d
You will get following error
No column name was specified for column 1 of 'd'
This is also in the Documentation
As for the usage, some like to write it the first method, some in the second, whatever you prefer. It's all the same
An observation: Using the table constructor values gives you no way of naming the columns, which makes it neccessary to use column naming after the table alias:
select * from
(values
(1,2) -- can't give a column name here
,(3,4)
) as tableName(column1,column2) -- gotta do it here
You've already had comments that point you to the documentation of how derived tables work, but not to answer you question regarding useful use cases for this functionality.
Personally I find this functionality to be useful whenever I want to create a set of addressable values that will be used extensively in your statement, or when I want to duplicate rows for whatever reason.
An example of addressable values would be a much more compelx version of the following, in which the calculated values in the v derived table can be used many times over via more sensible names, rather than repeated calculations that will be hard to follow:
select p.ProductName
,p.PackPricePlusVAT - v.PackCost as GrossRevenue
,etc
from dbo.Products as p
cross apply(values(p.UnitsPerPack * p.UnitCost
,p.UnitPrice * p.UnitsPerPack * 1.2
,etc
)
) as v(PackCost
,PackPricePlusVAT
,etc
)
and an example of being able to duplicate rows could be in creating an exception report for use in validating data, which will output one row for every DataError condition that the dbo.Product row satisfies:
select p.ProductName
,e.DataError
from dbo.Products as p
cross apply(values('Missing Units Per Pack'
,case when p.SoldInPacks = 1 and isnull(p.UnitsPerPack,0) < 1 then 1 end
)
,('Unusual Price'
,case when p.Price > (p.UnitsPerPack * p.UnitCost) * 2 then 1 end
)
,(etc)
) as e(DataError
,ErrorFlag
)
where e.ErrorFlag = 1
If you can understand what these two scripts are doing, you should find numerous examples of where being able to generate additional values or additional rows of data would be very helpful.

most effecient way to search SQL table using result from same table until final value is found

In the image above which represents an SQL table I would like to search 1111 and retrieve its last replaced number which should be 4444 where 1111 is just a single number replaced by a single number 4444. then i would like to search 5555 which should return (6666,9999,8888). NB 9999 had replaced 7777.
so 1111 was a single part number replaced multiple times and 5555 was a group number with multiple parts breakdown with one replaced number within(7777>>9999).
What would be the fastest and most efficient method?
if possible a solution using SQL for efficiency.
if unable within SQL then from within PHP.
What I have tried:
1) while loop. but need to access database 1000 times for 1000 replaced numbers. ##too inefficient.
2)
SELECT C.RPLPART
FROM TABLE A
left join TABLE B on A.RPLPART=B.PART#
left join TABLE C on B.RPLPART=C.PART#
WHERE A.PART#='1111' ##Unable to know when last number is reached.
A recursive Common Table Expression (CTE) would seem to be the ticket.
Something like so...
with rcte (lvl, topPart, part#, rplpart) as
(select 1 as lvl, part#, part#, rplpart
from MYTABLE
union all
select p.lvl + 1, p.topPart, c.part#, c.rplpart
from rcte p, MYTABLE c
where p.rplpart = c.part#
)
select topPart, rplpart
from rcte
where toppart = 1111
order by lvl desc
fetch first row only;
You can do this using a recursive CTE that generates a complete replacement chain for a given starting part id, and then limit the result to just those that don't exist in the parts# column:
WITH cte(part) AS
(SELECT replpart FROM parts WHERE part# = 1111
UNION ALL
SELECT parts.replpart FROM parts, cte WHERE parts.part# = cte.part)
SELECT DISTINCT part
FROM cte
WHERE part NOT IN (SELECT part# FROM parts);
Fiddle example

where clause with = sign matches multiple records while expected just one record

I have a simple inline view that contains 2 columns.
-----------------
rn | val
-----------------
0 | A
... | ...
25 | Z
I am trying to select a val by matching the rn randomly by using the dbms_random.value() method as in
with d (rn, val) as
(
select level-1, chr(64+level) from dual connect by level <= 26
)
select * from d
where rn = floor(dbms_random.value()*25)
;
My expectation is it should return one row only without failing.
But now and then I get multiple rows returned or no rows at all.
on the other hand,
>>select floor(dbms_random.value()*25) from dual connect by level <1000
returns a whole number for each row and I failed to see any abnormality.
What am I missing here?
The problem is that the random value is recalculated for each row. So, you might get two random values that match the value -- or go through all the values and never get a hit.
One way to get around this is:
select d.*
from (select d.*
from d
order by dbms_random.value()
) d
where rownum = 1;
There are more efficient ways to calculate a random number, but this is intended to be a simple modification to your existing query.
You also might want to ask another question. This question starts with a description of a table that is not used, and then the question is about a query that doesn't use the table. Ask another question, describing the table and the real problem you are having -- along with sample data and desired results.

SQL Query to remove cyclic redundancy

I have a table that looks like this:
Column A | Column B | Counter
---------------------------------------------
A | B | 53
B | C | 23
A | D | 11
C | B | 22
I need to remove the last row because it's cyclic to the second row. Can't seem to figure out how to do it.
EDIT
There is an indexed date field. This is for Sankey diagram. The data in the sample table is actually the result of a query. The underlying table has:
date | source node | target node | path count
The query to build the table is:
SELECT source_node, target_node, COUNT(1)
FROM sankey_table
WHERE TO_CHAR(data_date, 'yyyy-mm-dd')='2013-08-19'
GROUP BY source_node, target_node
In the sample, the last row C to B is going backwards and I need to ignore it or the Sankey won't display. I need to only show forward path.
Removing all edges from your graph where the tuple (source_node, target_node) is not ordered alphabetically and the symmetric row exists should give you what you want:
DELETE
FROM sankey_table t1
WHERE source_node > target_node
AND EXISTS (
SELECT NULL from sankey_table t2
WHERE t2.source_node = t1.target_node
AND t2.target_node = t1.source_node)
If you don't want to DELETE them, just use this WHERE clause in your query for generating the input for the diagram.
If you can adjust how your table is populated, you can change the query you're using to only retrieve the values for the first direction (for that date) in the first place, with a little bit an analytic manipulation:
SELECT source_node, target_node, counter FROM (
SELECT source_node,
target_node,
COUNT(*) OVER (PARTITION BY source_node, target_node) AS counter,
RANK () OVER (PARTITION BY GREATEST(source_node, target_node),
LEAST(source_node, target_node), TRUNC(data_date)
ORDER BY data_date) AS rnk
FROM sankey_table
WHERE TO_CHAR(data_date, 'yyyy-mm-dd')='2013-08-19'
)
WHERE rnk = 1;
The inner query gets the same data you collect now but adds a ranking column, which will be 1 for the first row for any source/target pair in any order for a given day. The outer query then just ignores everything else.
This might be a candidate for a materialised view if you're truncating and repopulating it daily.
If you can't change your intermediate table but can still see the underlying table you could join back to it using the same kind of idea; assuming the table you're querying from is called sankey_agg_table:
SELECT sat.source_node, sat.target_node, sat.counter
FROM sankey_agg_table sat
JOIN (SELECT source_node, target_node,
RANK () OVER (PARTITION BY GREATEST(source_node, target_node),
LEAST(source_node, target_node), TRUNC(data_date)
ORDER BY data_date) AS rnk
FROM sankey_table) st
ON st.source_node = sat.source_node
AND st.target_node = sat.target_node
AND st.rnk = 1;
SQL Fiddle demos.
DELETE FROM yourTable
where [Column A]='C'
given that these are all your rows
EDIT
I would recommend that you clean up your source data if you can, i.e. delete the rows that you call backwards, if those rows are incorrect as you state in your comments.

How to group by a column

Hi I know how to use the group by clause for sql. I am not sure how to explain this so Ill draw some charts. Here is my original data:
Name Location
----------------------
user1 1
user1 9
user1 3
user2 1
user2 10
user3 97
Here is the output I need
Name Location
----------------------
user1 1
9
3
user2 1
10
user3 97
Is this even possible?
The normal method for this is to handle it in the presentation layer, not the database layer.
Reasons:
The Name field is a property of that data row
If you leave the Name out, how do you know what Location goes with which name?
You are implicitly relying on the order of the data, which in SQL is a very bad practice (since there is no inherent ordering to the returned data)
Any solution will need to involve a cursor or a loop, which is not what SQL is optimized for - it likes working in SETS not on individual rows
Hope this helps
SELECT A.FINAL_NAME, A.LOCATION
FROM (SELECT DISTINCT DECODE((LAG(YT.NAME, 1) OVER(ORDER BY YT.NAME)),
YT.NAME,
NULL,
YT.NAME) AS FINAL_NAME,
YT.NAME,
YT.LOCATION
FROM YOUR_TABLE_7 YT) A
As Jirka correctly pointed out, I was using the Outer select, distinct and raw Name unnecessarily. My mistake was that as I used DISTINCT , I got the resulted sorted like
1 1
2 user2 1
3 user3 97
4 user1 1
5 3
6 9
7 10
I wanted to avoid output like this.
Hence I added the raw id and outer select
However , removing the DISTINCT solves the problem.
Hence only this much is enough
SELECT DECODE((LAG(YT.NAME, 1) OVER(ORDER BY YT.NAME)),
YT.NAME,
NULL,
YT.NAME) AS FINAL_NAME,
YT.LOCATION
FROM SO_BUFFER_TABLE_7 YT
Thanks Jirka
If you're using straight SQL*Plus to make your report (don't laugh, you can do some pretty cool stuff with it), you can do this with the BREAK command:
SQL> break on name
SQL> WITH q AS (
SELECT 'user1' NAME, 1 LOCATION FROM dual
UNION ALL
SELECT 'user1', 9 FROM dual
UNION ALL
SELECT 'user1', 3 FROM dual
UNION ALL
SELECT 'user2', 1 FROM dual
UNION ALL
SELECT 'user2', 10 FROM dual
UNION ALL
SELECT 'user3', 97 FROM dual
)
SELECT NAME,LOCATION
FROM q
ORDER BY name;
NAME LOCATION
----- ----------
user1 1
9
3
user2 1
10
user3 97
6 rows selected.
SQL>
I cannot but agree with the other commenters that this kind of problem does not look like it should ever be solved using SQL, but let us face it anyway.
SELECT
CASE main.name WHERE preceding_id IS NULL THEN main.name ELSE null END,
main.location
FROM mytable main LEFT JOIN mytable preceding
ON main.name = preceding.name AND MIN(preceding.id) < main.id
GROUP BY main.id, main.name, main.location, preceding.name
ORDER BY main.id
The GROUP BY clause is not responsible for the grouping job, at least not directly. In the first approximation, an outer join to the same table (LEFT JOIN below) can be used to determine on which row a particular value occurs for the first time. This is what we are after. This assumes that there are some unique id values that make it possible to arbitrarily order all the records. (The ORDER BY clause does NOT do this; it orders the output, not the input of the whole computation, but it is still necessary to make sure that the output is presented correctly, because the remaining SQL does not imply any particular order of processing.)
As you can see, there is still a GROUP BY clause in the SQL, but with a perhaps unexpected purpose. Its job is to "undo" a side effect of the LEFT JOIN, which is duplication of all main records that have many "preceding" ( = successfully joined) records.
This is quite normal with GROUP BY. The typical effect of a GROUP BY clause is a reduction of the number of records; and impossibility to query or test columns NOT listed in the GROUP BY clause, except through aggregate functions like COUNT, MIN, MAX, or SUM. This is because these columns really represent "groups of values" due to the GROUP BY, not just specific values.
If you are using SQL*Plus, use the BREAK function. In this case, break on NAME.
If you are using another reporting tool, you may be able to compare the "name" field to the previous record and suppress printing when they are equal.
If you use GROUP BY, output rows are sorted according to the GROUP BY columns as if you had an ORDER BY for the same columns. To avoid the overhead of sorting that GROUP BY produces, add ORDER BY NULL:
SELECT a, COUNT(b) FROM test_table GROUP BY a ORDER BY NULL;
Relying on implicit GROUP BY sorting in MySQL 5.6 is deprecated. To achieve a specific sort order of grouped results, it is preferable to use an explicit ORDER BY clause. GROUP BY sorting is a MySQL extension that may change in a future release; for example, to make it possible for the optimizer to order groupings in whatever manner it deems most efficient and to avoid the sorting overhead.
For full information - http://academy.comingweek.com/sql-groupby-clause/
SQL GROUP BY STATEMENT
SQL GROUP BY clause is used in collaboration with the SELECT statement to arrange identical data into groups.
Syntax:
1. SELECT column_nm, aggregate_function(column_nm) FROM table_nm WHERE column_nm operator value GROUP BY column_nm;
Example :
To understand the GROUP BY clauserefer the sample database.Below table showing fields from “order” table:
1. |EMPORD_ID|employee1ID|customerID|shippers_ID|
Below table showing fields from “shipper” table:
1. | shippers_ID| shippers_Name |
Below table showing fields from “table_emp1” table:
1. | employee1ID| first1_nm | last1_nm |
Example :
To find the number of orders sent by each shipper.
1. SELECT shipper.shippers_Name, COUNT (orders.EMPORD_ID) AS No_of_orders FROM orders LEFT JOIN shipper ON orders.shippers_ID = shipper.shippers_ID GROUP BY shippers_Name;
1. | shippers_Name | No_of_orders |
Example :
To use GROUP BY statement on more than one column.
1. SELECT shipper.shippers_Name, table_emp1.last1_nm, COUNT (orders.EMPORD_ID) AS No_of_orders FROM ((orders INNER JOIN shipper ON orders.shippers_ID=shipper.shippers_ID) INNER JOIN table_emp1 ON orders.employee1ID = table_emp1.employee1ID)
2. GROUP BY shippers_Name,last1_nm;
| shippers_Name | last1_nm |No_of_orders |
for more clarification refer my link
http://academy.comingweek.com/sql-groupby-clause/