Copying a SQL Server table and adding and rearranging columns

I know that if I want to make a copy of a SQL Server table, I can write a query akin to this:
SELECT *
INTO NewTable
FROM OldTable
But what if I wanted to take the contents of OldTable that may look like this:
| Column1 | Column2 | Column3 |
|---------|---------|---------|
| 1 | 2 | 3 |
| 4 | 5 | 6 |
| 7 | 8 | 9 |
and make a copy of that table but have the new table look like this:
| Column1 | Column3 | Column2 | Column4 | Column5 |
|--------- |--------- |--------- |--------- |--------- |
| 1 | 3 | 2 | 10 | 11 |
| 4 | 6 | 5 | 12 | 13 |
| 7 | 9 | 8 | 14 | 15 |
So now I've swapped Columns 2 and 3 and added Column 4 and Column 5. I don't need to have a query that will add that data to the columns, just the bare columns.

It's a matter of modifying your select statement. SELECT * takes only the columns from the source table, in their order. You want something different - so SELECT it.
SELECT * INTO NewTable
FROM OldTable
->
SELECT Column1, Column3, Column2, ' ' AS Column4, ' ' AS Column5
INTO NewTable
FROM OldTable
This gives you very little flexibility over how the new table's columns are specified, its indexes, and so on, so it's probably a bad idea. It's probably better to do this properly with CREATE TABLE, but if you need quick and dirty, I suppose...

You can just name the columns:
Select
[Column1], [Column3], [Column2], Cast(null as bigint) as [Column4], 0 as [Column5]
Into CopyTable
From YourTable
Just like any query, it is always preferable to use the Column names and avoid using *.
You can then add any value as [ColumnX] in the select.
You can use a cast to get the type you want in the new table.
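To try the idea out without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is an approximation, not the SQL Server syntax: SQLite has no SELECT ... INTO, so CREATE TABLE ... AS SELECT stands in for it, and the INTEGER type chosen for the two new columns is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE OldTable (Column1 INTEGER, Column2 INTEGER, Column3 INTEGER);
    INSERT INTO OldTable VALUES (1, 2, 3), (4, 5, 6), (7, 8, 9);

    -- Stand-in for SELECT ... INTO NewTable (not available in SQLite).
    CREATE TABLE NewTable AS
    SELECT Column1,
           Column3,
           Column2,
           CAST(NULL AS INTEGER) AS Column4,  -- bare column, no data (assumed type)
           CAST(NULL AS INTEGER) AS Column5   -- bare column, no data (assumed type)
    FROM OldTable;
""")

cursor = conn.execute("SELECT * FROM NewTable ORDER BY Column1")
new_columns = [d[0] for d in cursor.description]  # column order of the copy
new_rows = cursor.fetchall()
print(new_columns)
print(new_rows)
```

The copy lists the columns explicitly, so their order in the new table follows the select list rather than the source table.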

Related

SQL - Ordering second column based on the first column

I am trying to retrieve data from a table, but I need it to be ordered in a very specific way and I'm not sure if it's possible using Oracle SQL alone.
What I need to do is retrieve all of the rows, ordered so that each row where column 3 is null (shown as a blank in the tables below) comes first, followed immediately by the rows whose column 3 value equals that row's column 1 value.
What I have:
+------+-------+------+
| Col1 | Col2 | Col3 |
+------+-------+------+
| 1 | text | |
| 2 | text | 1 |
| 3 | text | 1 |
| 8 | text | 10 |
| 9 | text | 10 |
| 10 | text | |
+------+-------+------+
What I would like as a result:
+------+-------+------+
| Col1 | Col2 | Col3 |
+------+-------+------+
| 1 | text | |
| 2 | text | 1 |
| 3 | text | 1 |
| 10 | text | |
| 8 | text | 10 |
| 9 | text | 10 |
+------+-------+------+
What I have tried:
First thing I tried was using:
ORDER BY coalesce(Col3, Col1)
and it got me close to the result, but the Col1 value 10 needs to be shown before the Col3 value 10.
+------+-------+------+
| Col1 | Col2 | Col3 |
+------+-------+------+
| 1 | text | |
| 2 | text | 1 |
| 3 | text | 1 |
| 8 | text | 10 |
| 9 | text | 10 |
| 10 | text | |
+------+-------+------+
I've also tried creating a new column where Col4 is true if Col3 is null and false otherwise, but this was essentially the same thing as the coalesce above.
I also tried just running some basic order by's but had no success in achieving this.
In Oracle, you would just use nulls first:
order by coalesce(col3, col1), col3 nulls first, col1
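The three sort keys can be checked on the sample data with a small sketch in Python's built-in sqlite3 module. This is not Oracle: to stay portable across SQLite versions, the expression `col3 IS NOT NULL` (0 for the parent row, 1 for its children) is used as a stand-in for `col3 nulls first`, and the table name sample_table is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sample_table (col1 INTEGER, col2 TEXT, col3 INTEGER);
    INSERT INTO sample_table VALUES
        (1, 'text', NULL), (2, 'text', 1), (3, 'text', 1),
        (8, 'text', 10), (9, 'text', 10), (10, 'text', NULL);
""")

ordered_ids = [row[0] for row in conn.execute("""
    SELECT col1
    FROM sample_table
    ORDER BY COALESCE(col3, col1),  -- keeps each parent together with its children
             col3 IS NOT NULL,      -- 0 for the parent row, so it sorts first
             col1                   -- children in order of their own id
""")]
print(ordered_ids)  # [1, 2, 3, 10, 8, 9]
```

The first key alone leaves 8, 9 and 10 tied on the value 10 only for row 10; the second key is what moves row 10 ahead of its children.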
Your table looks very much like hierarchical data, where in some sense col1 is a unique row identifier, and col3 points to a row's parent row.
If so, it may be better to use a hierarchical query (connect by) for this. The ordering is hierarchical, and siblings (descendants from the same parent) are ordered according to the order siblings by clause.
Like this:
with
sample_table(col1, col2, col3) as (
select 1, 'text', null from dual union all
select 2, 'text', 1 from dual union all
select 3, 'text', 1 from dual union all
select 8, 'text', 10 from dual union all
select 9, 'text', 10 from dual union all
select 10, 'text', null from dual
)
select *
from sample_table
start with col3 is null
connect by col3 = prior col1
order siblings by col1
;
COL1 COL2 COL3
---------- ---- ----------
1 text
2 text 1
3 text 1
10 text
8 text 10
9 text 10
The with clause is not part of the solution - I added it there so I can test the query. (Remember this "with clause" way to create sample tables for testing - you can include them yourself, instead of the formatted table in your original question, so that people can easily test their answers on your sample data.)

Oracle SQL statement without duplicates

I have a requirement to write a SQL statement to return 2 columns, however there cannot be duplicates in either of these columns. For example:
|---------------------|------------------|
| 10 | A |
|---------------------|------------------|
| 11 | B |
|---------------------|------------------|
| 12 | C |
|---------------------|------------------|
| 13 | A | <--- Don't return
|---------------------|------------------|
Using distinct doesn't work, since the row highlighted above is distinct. It also doesn't matter which of the duplicates is returned.
Does anyone know of a way to do this? It feels as though I'm missing something obvious.
Thanks.
You can number the rows within each col2 value using ROW_NUMBER() and keep only the row where rn = 1.
CREATE TABLE T(
col1 int,
col2 varchar(5)
);
insert into t values (10,'A');
insert into t values (11,'B');
insert into t values (12,'C');
insert into t values (13,'A');
Query 1:
SELECT t1.col1,t1.col2
FROM (
SELECT t1.*,ROW_NUMBER() OVER(PARTITION BY col2 ORDER BY col1) rn
FROM T t1
)t1
WHERE t1.rn = 1
Results:
| COL1 | COL2 |
|------|------|
| 10 | A |
| 11 | B |
| 12 | C |
If you just want the lowest value from the first column, do:
SELECT MIN(column1), column2
FROM YourTable
GROUP BY column2
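The GROUP BY variant can be tried on the four sample rows with a minimal sketch in Python's built-in sqlite3 module (SQLite here is only a stand-in for Oracle; the table and column names follow the CREATE TABLE above).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (col1 INTEGER, col2 TEXT);
    INSERT INTO t VALUES (10, 'A'), (11, 'B'), (12, 'C'), (13, 'A');
""")

# One row per col2 value, keeping the lowest col1 of each group.
deduped = conn.execute("""
    SELECT MIN(col1) AS col1, col2
    FROM t
    GROUP BY col2
    ORDER BY col2
""").fetchall()
print(deduped)  # [(10, 'A'), (11, 'B'), (12, 'C')]
```

Note that this only removes duplicates in col2. It happens to leave col1 unique here because col1 has no repeated values in the sample data.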
This is not possible in one query, because each column has a different number of unique values.

Oracle group by only ONE column

I have a table in an Oracle database, which has 40 columns.
I know that if I want to do a group by query, all the columns in select must be in group by.
I simply just want to do:
select col1, col2, col3, col4, col5 from table group by col3
If I try:
select col1, col2, col3, col4, col5 from table group by col1, col2, col3, col4, col5
It does not give the required output.
I have searched for this, but did not find any solution. All the queries that I found use some kind of Add() or count(*) function.
In Oracle is it not possible to simply group by one column ?
UPDATE:
My apologies, for not being clear enough.
My Table:
+--------+----------+-------------+-------+
| id | col1 | col2 | col3 |
+--------+----------+-------------+-------+
| 1 | 1 | some text 1 | 100 |
| 2 | 1 | some text 1 | 200 |
| 3 | 2 | some text 1 | 200 |
| 4 | 3 | some text 1 | 78 |
| 5 | 4 | some text 1 | 65 |
| 6 | 5 | some text 1 | 101 |
| 7 | 5 | some text 1 | 200 |
| 8 | 1 | some text 1 | 200 |
| 9 | 6 | some text 1 | 202 |
+--------+----------+-------------+-------+
and by running following query:
select col1, col2, col3 from table where col3='200' group by col1;
I will get the following desired Output:
+--------+----------+-------------+-------+
| id | col1 | col2 | col3 |
+--------+----------+-------------+-------+
| 2 | 1 | some text 1 | 200 |
| 3 | 2 | some text 1 | 200 |
| 7 | 5 | some text 1 | 200 |
+--------+----------+-------------+-------+
Long comment here;
Yeah, you can't do that. Think about it... If you have a table like so:
Col1 Col2 Col3
A A 1
B A 2
C A 3
And you're grouping by only Col2, which will group down to a single row... what happens to Col1 and Col3? Both of those have 3 distinct row values.
How is your DBMS supposed to display those?
Col1 Col2 Col3
A? A 1?
B? 2?
C? 3?
This is why you have to group by all columns, or otherwise aggregate or concatenate them. (SUM(),MAX(), MIN(), etc..)
Show us how you want the results to look and I'm sure we can help you.
Edit - Answer:
First off, thanks for updating your question. Your query doesn't have id but your expected results do, so I will answer for each separately.
Without id
You will still need to group by all columns to achieve what you're going for. Let's walk through it.
If you run your query without any group by:
select col1, col2, col3 from table where col3='200'
You will get this back:
+----------+-------------+-------+
| col1 | col2 | col3 |
+----------+-------------+-------+
| 1 | some text 1 | 200 |
| 2 | some text 1 | 200 |
| 5 | some text 1 | 200 |
| 1 | some text 1 | 200 |
+----------+-------------+-------+
So now you want to only see the col1 = 1 row once. But to do so, you need to roll all of the columns up, so your DBMS knows what to do with each of them. If you try to group by only col1, your DBMS will throw an error because you didn't tell it what to do with the extra data in col2 and col3:
select col1, col2, col3 from table where col3='200' group by col1 --Errors
+----------+-------------+-------+
| col1 | col2 | col3 |
+----------+-------------+-------+
| 1 | some text 1 | 200 |
| 2 | some text 1 | 200 |
| 5 | some text 1 | 200 |
| ? | some text 1?| 200? |
+----------+-------------+-------+
If you group by all 3, your DBMS knows to group together the entire rows (which is what you want), and will only display duplicate rows once:
select col1, col2, col3 from table where col3='200' group by col1, col2, col3
+----------+-------------+-------+
| col1 | col2 | col3 |
+----------+-------------+-------+
| 1 | some text 1 | 200 |
| 2 | some text 1 | 200 | --Desired results
| 5 | some text 1 | 200 |
+----------+-------------+-------+
With id
If you want to see id, you will have to tell your DBMS which id to display. Even if we group by all columns, you won't get your desired results, because the id column will make each row distinct (They will no longer group together):
select id, col1, col2, col3 from table where col3='200' group by id, col1, col2, col3
+--------+----------+-------------+-------+
| id | col1 | col2 | col3 |
+--------+----------+-------------+-------+
| 2 | 1 | some text 1 | 200 | --id = 2
| 3 | 2 | some text 1 | 200 |
| 7 | 5 | some text 1 | 200 |
| 8 | 1 | some text 1 | 200 | --id = 8
+--------+----------+-------------+-------+
So in order to group these rows, we need to explicitly say what to do with the ids. Based on your desired results, you want to choose id = 2, which is the minimum id, so let's use MIN():
select MIN(id), col1, col2, col3 from table where col3='200' group by col1, col2, col3
--Note, MIN() is an aggregate function, so id need not be in the group by
Which returns your desired results (with id):
+--------+----------+-------------+-------+
| id | col1 | col2 | col3 |
+--------+----------+-------------+-------+
| 2 | 1 | some text 1 | 200 |
| 3 | 2 | some text 1 | 200 |
| 7 | 5 | some text 1 | 200 |
+--------+----------+-------------+-------+
Final thought
Here were your two trouble rows:
+--------+----------+-------------+-------+
| id | col1 | col2 | col3 |
+--------+----------+-------------+-------+
| 2 | 1 | some text 1 | 200 |
| 8 | 1 | some text 1 | 200 |
+--------+----------+-------------+-------+
Any time you hit these, just think about what you want each column to do, one at a time. You will need to handle all columns any time you do grouping or aggregates.
id, you only want to see id = 2, which is the MIN()
col1, you only want to see distinct values, so GROUP BY
col2, you only want to see distinct values, so GROUP BY
col3, you only want to see distinct values, so GROUP BY
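The column-by-column reasoning above can be checked with a short sketch in Python's built-in sqlite3 module. SQLite stands in for Oracle here, and the table name my_table is an assumption (the question only calls it "table", which is a reserved word).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (id INTEGER, col1 INTEGER, col2 TEXT, col3 INTEGER);
    INSERT INTO my_table VALUES
        (1, 1, 'some text 1', 100), (2, 1, 'some text 1', 200),
        (3, 2, 'some text 1', 200), (4, 3, 'some text 1', 78),
        (5, 4, 'some text 1', 65),  (6, 5, 'some text 1', 101),
        (7, 5, 'some text 1', 200), (8, 1, 'some text 1', 200),
        (9, 6, 'some text 1', 202);
""")

# id is aggregated with MIN(); every other selected column is grouped.
grouped = conn.execute("""
    SELECT MIN(id) AS id, col1, col2, col3
    FROM my_table
    WHERE col3 = 200
    GROUP BY col1, col2, col3
    ORDER BY col1
""").fetchall()
print(grouped)
```

Rows 2 and 8 collapse into one group, and MIN(id) decides that id 2 is the one displayed.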
Maybe analytic functions are what you need. Try something like this:
select col1, col2, col3, col4, col5
, sum(col3) over (partition by col1) as col1_summary -- any numeric column; col3 is only an example
, count(*) over () as total_count
from t1
If you search for the topic you will find thousands of examples, for instance the article "Introduction to Analytic Functions (Part 1)".
Why do you want to GROUP BY , wouldn't you want to ORDER BY instead?
If you state an English language version of the problem you are trying to solve (i.e. the requirements) it would be easier to be more specific.
I guess maybe you need the UNPIVOT function, or else post the specific final result you want:
select col3, col_group
from table
UNPIVOT ( col_group for value in ( col1,col2,col4,col5))
SELECT * FROM table
WHERE id IN (SELECT MIN(id) FROM table WHERE col3='200' GROUP BY col1)

SQL: Use distinct on groups of similar data

Hello mates, I have the following problem in a Vertica database. I have a large table:
+------+------+------+
| Date | Col1 | Col2 |
+------+------+------+
| 1 | A | B |
| 2 | A | B |
| 3 | D | E |
| 2 | C | D |
| 1 | C | D |
+------+------+------+
As you can see I have redundant data, just taken on different dates (row 1 & 2 and row 4 & 5). So I would like a table that removes that redundant data by deleting the rows with the lower date, giving me a result like that:
+------+------+------+
| Date | Col1 | Col2 |
+------+------+------+
| 2 | A | B |
| 2 | C | D |
| 3 | D | E |
+------+------+------+
Using distinct would not work since it will delete rows randomly not considering the date, so I might end up with a table like this:
SELECT DISTINCT Col1, Col2 from Table
+------+------+------+
| Date | Col1 | Col2 |
+------+------+------+
| 2 | A | B |
| 1 | C | D |
| 3 | D | E |
+------+------+------+
which is not desired.
Is there any way to accomplish that?
Thanks mates
Do a GROUP BY on your 2 columns and aggregate on the highest date:
SELECT MAX(Date), col1, col2
FROM table
GROUP BY Col1, Col2
I'm just generalizing the patterns here and adding one. For the exact question asked, any of these methods would probably work; the devil is in the details.
The aggregate method proposed by @Thomas_G works because you only have 1 column outside the grouping. If you had two, it could mix and match (some data from one row, some from another), which is not likely what you want as a duplicate handling strategy.
The analytical method proposed by @Gordon_Linoff is good, but be aware that if the max date is duplicated within a group in the source data, you'll get multiple rows for that group. This might be what you want, but maybe not.
Another method is to just peel off the top row in the window. It will choose the first row in the partition based on your window ordering. If there are multiple rows at the max date, then you can't guarantee which one will be chosen unless you include something more in the window order. But at least you know you'll only get one row, for what it's worth.
select t.*
from (select t.*, row_number() over (partition by col1, col2 order by date desc) as rn
from t
) t
where rn = 1;
If there are other columns that you care about, you can use window functions:
select t.*
from (select t.*, max(date) over (partition by col1, col2) as maxd
from t
) t
where date = maxd;
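The difference between the two window-function queries only shows up when the max date is tied, which the sample data does not contain. The sketch below uses Python's built-in sqlite3 module (a stand-in for Vertica, and it needs SQLite 3.25 or later for window functions). The note column and the tied rows are assumptions added purely to make the tie visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (date INTEGER, col1 TEXT, col2 TEXT, note TEXT);
    INSERT INTO t VALUES
        (1, 'A', 'B', 'old'),
        (2, 'A', 'B', 'tie-1'),   -- two rows share the max date for (A, B)
        (2, 'A', 'B', 'tie-2'),
        (3, 'D', 'E', 'only');
""")

# ROW_NUMBER keeps exactly one row per (col1, col2), chosen arbitrarily on ties.
top_row = conn.execute("""
    SELECT date, col1, col2, note
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY col1, col2 ORDER BY date DESC) AS rn
          FROM t) AS ranked
    WHERE rn = 1
""").fetchall()

# MAX(date) OVER keeps every row that sits on the group's max date.
at_max_date = conn.execute("""
    SELECT date, col1, col2, note
    FROM (SELECT t.*,
                 MAX(date) OVER (PARTITION BY col1, col2) AS maxd
          FROM t) AS with_max
    WHERE date = maxd
""").fetchall()

print(len(top_row), len(at_max_date))  # 2 3
```

Which of the two tied rows ROW_NUMBER keeps is not defined unless the window's ORDER BY includes a tie-breaker.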

merging content of two tables without duplicating content

I have two identical SQL Server tables (SOURCE and DESTINATION) with lots of columns in each. I want to insert into table DESTINATION the rows from table SOURCE that do not already exist in table DESTINATION. I define two rows as equal if all columns match except for the timestamp, a count column, and the integer primary key. So I want to insert into DESTINATION all rows in SOURCE that don't already exist in DESTINATION, ignoring the count, timestamp, and primary key columns.
How do I do this?
Thanks for all the contributions! I chose to use the Merge command since it is structured to allow for updates and inserts in one statement and I needed to do the update separately.
this is the code that worked:
Merge
into DESTINATION as D
using SOURCE as S
on (
D.Col1 = S.Col1
and D.Col2 = S.Col2
and D.Col3 = S.Col3
)
WHEN MATCHED
THEN UPDATE SET D.Count = S.Count
WHEN NOT MATCHED THEN
INSERT (Col1, Col2, Col3, Count, timestamp)
VALUES (S.Col1, S.Col2, S.Col3, S.Count, S.timestamp);
note: when I wrote this question first I called the tables AAA and BBB. I edited and changed the names of AAA to SOURCE AND BBB to DESTINATION for clarity
Using a SELECT statement for this purpose has been obsolete since SQL Server 2008. Instead of SELECT, you can use the MERGE statement:
ref:
http://technet.microsoft.com/en-us/library/bb510625.aspx
http://weblogs.sqlteam.com/peterl/archive/2007/09/20/Example-of-MERGE-in-SQL-Server-2008.aspx
Something like this:
INSERT INTO BBB(id, timestamp, mycount, col1, col2, col3, etc.)
SELECT id, timestamp, mycount, col1, col2, col3, etc.
FROM AAA
WHERE
NOT EXISTS(SELECT NULL FROM BBB oldb WHERE
oldb.col1 = AAA.col1
AND oldb.col2 = AAA.col2
AND oldb.col3 = AAA.col3
)
Add columns as needed to the NOT EXISTS clause.
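The NOT EXISTS pattern can be exercised with a small sketch in Python's built-in sqlite3 module. SQLite is only a stand-in for SQL Server, and the table names source_rows and destination_rows, the two comparison columns, and the sample values are all assumptions. Unlike the statement above, the sketch leaves the id out of the insert so the destination assigns its own key, which avoids primary key collisions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_rows (
        id INTEGER PRIMARY KEY, mycount INTEGER, col1 INTEGER, col2 INTEGER);
    CREATE TABLE destination_rows (
        id INTEGER PRIMARY KEY, mycount INTEGER, col1 INTEGER, col2 INTEGER);

    INSERT INTO source_rows (mycount, col1, col2) VALUES (5, 1, 1), (7, 1, 2), (9, 2, 2);
    INSERT INTO destination_rows (mycount, col1, col2) VALUES (1, 1, 1);

    -- Copy only the source rows with no match on the comparison columns.
    -- mycount and id are deliberately left out of the comparison.
    INSERT INTO destination_rows (mycount, col1, col2)
    SELECT s.mycount, s.col1, s.col2
    FROM source_rows AS s
    WHERE NOT EXISTS (
        SELECT 1
        FROM destination_rows AS d
        WHERE d.col1 = s.col1 AND d.col2 = s.col2
    );
""")

merged = conn.execute(
    "SELECT mycount, col1, col2 FROM destination_rows ORDER BY col1, col2"
).fetchall()
print(merged)  # [(1, 1, 1), (7, 1, 2), (9, 2, 2)]
```

The row (1, 1) already existed in the destination, so it keeps its original count of 1 and is not inserted again.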
A solution using good ol'-fashioned LEFT JOIN -- note in the example below, only the first row of BBB is inserted into AAA, because only it has no matching row in AAA. You'd replace col1 and col2 with the actual columns of the tables.
> select * from AAA;
+---------------------+------+------+
| timestamp | col1 | col2 |
+---------------------+------+------+
| 2012-03-17 08:17:22 | 1 | 1 |
| 2012-03-17 08:17:27 | 1 | 2 |
| 2012-03-17 08:17:30 | 1 | 3 |
| 2012-03-17 08:17:32 | 1 | 4 |
| 2012-03-17 08:17:49 | 2 | 2 |
| 2012-03-17 08:17:52 | 2 | 3 |
| 2012-03-17 08:17:54 | 2 | 4 |
+---------------------+------+------+
7 rows in set (0.00 sec)
> select * from BBB;
+---------------------+------+------+
| timestamp | col1 | col2 |
+---------------------+------+------+
| 2012-03-17 08:18:16 | 2 | 1 |
| 2012-03-17 08:18:18 | 2 | 2 |
| 2012-03-17 08:18:20 | 2 | 3 |
+---------------------+------+------+
3 rows in set (0.00 sec)
> INSERT INTO AAA
SELECT BBB.* FROM BBB
LEFT JOIN AAA
USING(col1,col2)
WHERE AAA.timestamp IS NULL;
> select * from AAA;
+---------------------+------+------+
| timestamp | col1 | col2 |
+---------------------+------+------+
| 2012-03-17 08:17:22 | 1 | 1 |
| 2012-03-17 08:17:27 | 1 | 2 |
| 2012-03-17 08:17:30 | 1 | 3 |
| 2012-03-17 08:17:32 | 1 | 4 |
| 2012-03-17 08:17:49 | 2 | 2 |
| 2012-03-17 08:17:52 | 2 | 3 |
| 2012-03-17 08:17:54 | 2 | 4 |
| 2012-03-17 08:18:16 | 2 | 1 |
+---------------------+------+------+
8 rows in set (0.00 sec)