DISTINCT for large table


I have a very large table in which one column holds strings such as TypeA, TypeB, etc. I would like to write a query that uses a CASE expression on that column:
CASE WHEN col1 = 'TypeA' Then '25'
WHEN col1 = 'TypeB' Then '28'
...
WHEN col1 = '????' Then '15'
END
but I do not know how many unique values that column has or what they are (they are words or phrases of up to 3 words).
I know I could find those unique values with
SELECT DISTINCT col1 FROM table1
or
SELECT col1 FROM table1 GROUP BY col1
but because of the size of the table the query runs endlessly.
Can I do this efficiently? I want to find all unique values from just one column.

It seems better to create a separate table that holds the unique values. You can then join on that table instead of writing a CASE expression, so the set of values stays open-ended. You can also replace the column in the original table with a reference to the value table.
DISTINCT on the original table is slow because there does not seem to be an index on col1. Put an index or primary key on col1 of the value table.
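
A minimal sketch of the value-table idea, assuming a dialect such as SQLite or PostgreSQL. The names col1_values, mapped_value and idx_table1_col1 are hypothetical; the values '25' and '28' are taken from the question.

```sql
-- Small stand-in for the large table from the question.
CREATE TABLE table1 (id INTEGER PRIMARY KEY, col1 VARCHAR(100));
INSERT INTO table1 (id, col1) VALUES (1, 'TypeA'), (2, 'TypeB'), (3, 'TypeA');

-- An index on col1 lets later DISTINCT / GROUP BY queries read the index
-- instead of the whole table (building it still costs one full pass).
CREATE INDEX idx_table1_col1 ON table1 (col1);

-- Hypothetical value table: one row per distinct col1 value.
-- The primary key provides the index the answer recommends.
CREATE TABLE col1_values (
    col1         VARCHAR(100) PRIMARY KEY,
    mapped_value VARCHAR(10)
);

-- One-off population of the value table.
INSERT INTO col1_values (col1)
SELECT DISTINCT col1 FROM table1 WHERE col1 IS NOT NULL;

-- Fill in the mapping that the CASE expression would have hard-coded.
UPDATE col1_values SET mapped_value = '25' WHERE col1 = 'TypeA';
UPDATE col1_values SET mapped_value = '28' WHERE col1 = 'TypeB';

-- The join replaces the long CASE expression.
SELECT t.id, t.col1, v.mapped_value
FROM table1 t
LEFT JOIN col1_values v ON v.col1 = t.col1
ORDER BY t.id;
```

New values then only need a new row in col1_values, not a change to every query that used the CASE expression.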

Related

SQL - get only colums from a table where not all values are nulls

SQL question:
How do I get all column values from columns where not all values are null?
Table A
COL1 COL2 COL3 COL4 COL5
---------------------------------------
abc 1 NULL NULL NULL
def 2 NULL testA NULL
NULL 3 NULL testB NULL
jkl 4 NULL NULL NULL
I want to get
COL1 COL2 COL4
-----------------------
abc 1 NULL
def 2 testA
NULL 3 testB
jkl 4 NULL
Is there a SQL or PL/SQL solution to achieve this?
To avoid answers that are irrelevant: assume I have a million rows.
I want the result to be a view or a result table.
Not a written output.
I found a similar question, but it does not satisfy my need:
How to select columns from a table which have non null values?
The column names can be quickly grabbed through this query
select column_name
from all_tab_columns
where lower(table_name) = 'tablea' and num_distinct > 0;
I understand I could create a script with a cursor and then loop through it, adding the result to a new table or view.
This is not what I need. I wondered if this could be done using a single query, using pivot/unpivot or something else.

What you are asking for is not possible in plain SQL, unless you know ahead of time which columns are NULL in every row. (It seems that you don't want to assume that you know that.)
Which columns are included in the output (how many columns, their names, and the order in which they appear) must be hard-coded in the SELECT clause; it can't be determined at runtime. On the other hand, you will know which columns are all-NULL only after reading the data (meaning, at runtime), or else you must have that information from an external source.
The only way to do what you seem to want is with dynamic SQL. That is an advanced topic, and a technique generally considered a poor business practice.
WHY do you not want to show columns with all-NULL values? Are you sure that requirement is meaningful?

Try these steps, they may help:
Create table Temp as select * from TableA;

Declare
  NbrRows       Number(10);
  CountNullRows Number(10);
Begin
  Select count(*) into NbrRows from TableA;

  -- count(*) is required here: count(COL1) skips NULLs and would always return 0
  Select count(*) into CountNullRows from TableA where COL1 is null;
  if NbrRows = CountNullRows then
    -- DDL cannot run directly inside PL/SQL, so it goes through dynamic SQL
    execute immediate 'Alter table Temp drop column COL1';
  end if;

  Select count(*) into CountNullRows from TableA where COL2 is null;
  if NbrRows = CountNullRows then
    execute immediate 'Alter table Temp drop column COL2';
  end if;

  Select count(*) into CountNullRows from TableA where COL3 is null;
  if NbrRows = CountNullRows then
    execute immediate 'Alter table Temp drop column COL3';
  end if;
  -- ...etc...
End;
/
Do the same thing for all your columns.
You then have the desired result in the Temp table.

Reject a row based on 2 column values

Below is the output of a simple join query. All the 3 columns are from different tables.
Col1 Col2 Col3
Manual Y-Yes Include
MC Y-Yes Include
Manual Y-Yes Exclude
Manual Y-Yes Exclude
I need to get the rows with 'Include' only if there is no 'Exclude' for the same Col1 value.
If there is no 'Exclude' for the Col1 value, then it's fine to display 'Include'.
So the query should not display the first row according to the requirement since the Col1 value 'Manual' has 'Exclude'.

Your SQL query should look a lot like your question phrased in English:
You want all the rows for which no row with the same col1 value has 'Exclude' in col3, right?
I cannot give exact SQL since you do not provide table or column names, but if all three columns were in the same table, it would look like this:
Select * from mytable t
where not exists
(select * from mytable x
where x.col1 = t.col1
and x.col3 = 'Exclude')
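
As a rough, self-contained check of that approach, the sketch below loads the four rows from the question into a single table (an assumption, since the real columns come from three joined tables) and also filters on 'Include', as the question asks. It assumes a dialect such as SQLite or PostgreSQL.

```sql
-- Sample rows copied from the question, flattened into one table.
CREATE TABLE mytable (col1 VARCHAR(20), col2 VARCHAR(20), col3 VARCHAR(20));
INSERT INTO mytable (col1, col2, col3) VALUES
    ('Manual', 'Y-Yes', 'Include'),
    ('MC',     'Y-Yes', 'Include'),
    ('Manual', 'Y-Yes', 'Exclude'),
    ('Manual', 'Y-Yes', 'Exclude');

-- Keep 'Include' rows whose col1 value never appears with 'Exclude'.
SELECT t.col1, t.col2, t.col3
FROM mytable t
WHERE t.col3 = 'Include'
  AND NOT EXISTS (
        SELECT 1
        FROM mytable x
        WHERE x.col1 = t.col1
          AND x.col3 = 'Exclude'
      );
-- Returns only the row: MC, Y-Yes, Include
```

With the real schema, the inner and outer references to mytable would each be replaced by the original three-table join.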

SQL: How to update an empty column with pre-defined set of values

I have a table with, let's say, 100 records. The table has two columns. The first column (A) has unique values. The second column (B) has NULL values
For 4 elements from column A I'd like to associate some earlier defined values, and they are unique as well.
I don't care about which value from column B will be associated with the value from column A. I'd like to associate 4 unique values with another 4 unique values. Basically, like I'd cut and paste a block of values from one column to another in excel.
How can I do it without using cursors?
I'd like to use one Update statement for ALL rows instead of one Update statement for EVERY row, as I do now.

Try this:
UPDATE t
SET ColumnB = BValue
FROM Table t
INNER JOIN
(
SELECT 1 AValue, 'Mouse' BValue UNION
SELECT 2, 'Cat' UNION
SELECT 3, 'Dog' UNION
SELECT 4, 'Wolf'
) PreDefined ON(t.ColumnA = PreDefined.AValue)
Use any number you want in the 'PreDefined' table, as long as they are unique and within the range of values in columnA of your original table.

If you are only trying to fill a table for testing purposes, I guess you could:
A) Use the value from Column A itself (as it is already unique).
B) If they are to be different, use some function on the column A's value to obtain a column B value (something simple, like ColumnA * 10, which stays unique because column A is unique).
C) Create a temp table with a "dictionary" setting a B value for each possible A value, and then update the rows desired on your table looking up from values on this dictionary table.
Anyway, if you explain a little further your purpose it will be easier to try suggesting you a solution.
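
Option C could look roughly like the sketch below. It assumes a dialect such as SQLite or PostgreSQL; the table names target_table and dictionary and the sample values are hypothetical.

```sql
-- Hypothetical target table: column A is unique, column B starts out NULL.
CREATE TABLE target_table (ColumnA INTEGER PRIMARY KEY, ColumnB VARCHAR(20));
INSERT INTO target_table (ColumnA) VALUES (1), (2), (3), (4), (5);

-- Hypothetical "dictionary": one B value for each A value to be filled.
CREATE TABLE dictionary (AValue INTEGER PRIMARY KEY, BValue VARCHAR(20));
INSERT INTO dictionary (AValue, BValue) VALUES
    (1, 'Mouse'), (2, 'Cat'), (3, 'Dog'), (4, 'Wolf');

-- A single UPDATE covers every row that has a dictionary entry.
-- The EXISTS filter leaves rows without an entry (ColumnA = 5) untouched.
UPDATE target_table
SET ColumnB = (SELECT d.BValue
               FROM dictionary d
               WHERE d.AValue = target_table.ColumnA)
WHERE EXISTS (SELECT 1
              FROM dictionary d
              WHERE d.AValue = target_table.ColumnA);

SELECT ColumnA, ColumnB FROM target_table ORDER BY ColumnA;
```

The correlated subquery form is used here because it avoids the dialect-specific UPDATE ... FROM syntax.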

if your animal data is already in a database table, then you can use a single update statement like this:
update target_table t4
set columnb = (
select animal_name
from (select columna, animal_name
from (select rownum rowNumber, animal_name from animal_table) t1
join (select rownum rowNumber, columna from target_table t1 where columnb is null) t2
on t1.rowNumber = t2.rowNumber
) t3
where t4.columna = t3.columna
)
;
This works by selecting a sequence number and animal name from the source table, then selecting a sequence number and columna value from your target table. By joining those records on the sequence number you guarantee that you get exactly 1 animal name for each columna value. You can then join those columna-to-animal records to your target table to update columnb.
For more background on updating one table from values in another, you might consider the solutions presented here: Update rows in one table with data from another table based on one column in each being equal. The only difference is that in your example you do not have any column that matches between your target table and your animal names table, so you need to use rownum to create an arbitrary 1-to-1 matching of records.
If your unique options are in a text file or spreadsheet, then you can format them into a fixed-width, space-padded string (each entry padded to 6 characters here) and pick the one you want using the rownum index like so:
update table_name
set columnb = trim(substr('mouse cat   dog   wolf  ', rownum*6-5, 6))
where columnb is null;

Oracle Compare data between two different table

I have two tables. In one, every field is VARCHAR2; in the other, the fields have different types.
For Example :
Table One
==========================
Col 1 VARCHAR2 UNIQUE KEY
Col 2 VARCHAR2
Col 3 VARCHAR2
===========================
Table Two
==========================
Col One VARCHAR2 UNIQUE KEY
Col Two TIMESTAMP
Col Three NUMBER
==========================
We also have a mapping table. It denotes which column of Table One has to be compared with which column of Table Two.
For Example
Mapping Table
==============================
Table One Table Two
==============================
Col 1 Col One
Col 2 Col Three
Col 3 Col Two
==============================
Now, using the UNIQUE KEY of TABLE ONE, we have to find the same row in TABLE TWO, compare the rows column by column, and get the changes in the data.
Currently we use a Java program that compares the data row by row and column by column and reports the changes between rows with the same UNIQUE KEY. It works fine but takes too much time, as we have 100000 records in the DB.
Now my question is: is there any way I can compare the data at the SQL level and get the changes?

You can do it 'manually' with a query like the one below. It's a lot of typing, but there are only three kinds of checks you need, so it's not very complex:
select
*
from
Table1 t1
full outer join Table2 t2 on t2.ID = t1.ID
where
-- Check ID: the record is missing from one of the tables.
t1.ID is null or
t2.ID is null or
-- Not nullable field can be easily compared.
t1.NotNullableField1 <> t2.NotNUllableField1 or
-- Nullable field is slightly more work.
t1.NullableField1 <> t2.NullableField1 or
(t1.NullableField1 is null and t2.NullableField1 is not null) or
(t1.NullableField1 is not null and t2.NullableField1 is null)
Another solution is to use MINUS, which is a bit like UNION, only it returns a dataset minus the records in a second dataset:
select * from Table1 t1
MINUS
select * from Table2 t2
This works only one way (which might be fine for your purpose), but you can also combine it with UNION to make it bidirectional.
select
*
from
( select * from Table1
MINUS
select * from Table2)
UNION ALL
( select * from Table2
MINUS
select * from Table1)
The output of both solutions is a bit different.
In the FULL OUTER JOIN query, the IDs will be joined and the values of the matching rows will be displayed next to each other as a single row.
In the MINUS query, the result is presented as a single dataset. If a record exists in only one of the tables, it is displayed. If a record (ID) exists in both tables but other fields differ, you get both rows, so they are a bit harder to compare.
See: http://www.techonthenet.com/oracle/minus.php
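
The bidirectional comparison can be sketched as follows. This is an illustration under assumptions, not the asker's schema: it uses EXCEPT, the standard-SQL counterpart of Oracle's MINUS, so that it runs on SQLite or PostgreSQL; the table and column names are hypothetical; and the VARCHAR column is cast to the other table's type before comparing, which the type mismatch in the question makes necessary.

```sql
-- Hypothetical stand-ins: table_one stores everything as text,
-- table_two stores the same data with a numeric type.
CREATE TABLE table_one (id VARCHAR(10) PRIMARY KEY, amount VARCHAR(20));
CREATE TABLE table_two (id VARCHAR(10) PRIMARY KEY, amount INTEGER);

INSERT INTO table_one (id, amount) VALUES ('a', '10'), ('b', '20'), ('c', NULL);
INSERT INTO table_two (id, amount) VALUES ('a', 10), ('b', 25), ('d', 40);

-- Rows present on one side but not (identically) on the other.
-- EXCEPT treats two NULLs as equal, so nullable columns need no extra checks.
SELECT 'only in table_one' AS side, d.id, d.amount
FROM (SELECT id, CAST(amount AS INTEGER) AS amount FROM table_one
      EXCEPT
      SELECT id, amount FROM table_two) d
UNION ALL
SELECT 'only in table_two' AS side, d.id, d.amount
FROM (SELECT id, amount FROM table_two
      EXCEPT
      SELECT id, CAST(amount AS INTEGER) AS amount FROM table_one) d;
-- Row 'a' matches on both sides and is not reported.
-- Row 'b' is reported twice (20 on one side, 25 on the other).
-- Rows 'c' and 'd' are each reported once.
```

The side column is an addition that makes it easier to tell which table each differing row came from.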

Matching records with wild cards from two different tables

I have two tables with the following data (amongst other data).
Table 1
Value 1
'003232339639
'00264644106272
0026461226291#
I need to match the second column in the table below using column 1 as an identifier
Table 2
Value 1 Value 2
00264 1
0026485 2
0026481 3
00322889 4
00323283 5
00323288 6
So the results I need will be as follows:
Result
Table 1, Value 1 Table 2, Value 2
'003232339639......4
'00264644106272....1
0026461226291#.....1
Any help will be appreciated - very stuck here and doing it manually at the moment in excel.
I hope this format makes sense - first time I am using this forum.

Melany, the question is somewhat confusing, which is perhaps why no one is responding. I'll attempt to explain how similar selects are done.
SELECTING DATA FROM TABLE1 WHERE A MATCHING COLUMN (COL1) EXISTS IN BOTH TABLE
SELECT * FROM TABLE1
INNER JOIN TABLE2
ON TABLE1.COL1 = TABLE2.COL1
AND TABLE1.COL1 = 'XYZ'
USING A SUBSELECT FOR THE SAME
SELECT * FROM TABLE1
WHERE COL1 IN(SELECT COL1 FROM TABLE2
WHERE COL1 = 'XYZ')

In SQL, the wildcard for zero or more characters is %, and it is used with the keyword LIKE.
So I suggest the following (if your purpose is really to match rows in Table1 whose Value1 begins with a value in Table2.Value1):
SELECT Table1.Value1, Table2.Value2 FROM Table1 JOIN Table2 ON Table1.Value1 LIKE CONCAT(Table2.Value1, '%');
Edit: replace CONCAT(x, y) with x || y for some DBMSs (SQLite for instance).
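
A small sketch of the prefix join, using two of the sample values from the question and the || form of concatenation (SQLite or PostgreSQL). The leading apostrophes in the question's data are assumed to be spreadsheet artifacts and are left out.

```sql
CREATE TABLE Table1 (Value1 VARCHAR(30));
CREATE TABLE Table2 (Value1 VARCHAR(30), Value2 INTEGER);

INSERT INTO Table1 (Value1) VALUES ('00264644106272'), ('0026461226291#');
INSERT INTO Table2 (Value1, Value2) VALUES
    ('00264',   1),
    ('0026485', 2),
    ('0026481', 3);

-- A Table1 row matches when its value starts with a Table2 prefix.
SELECT t1.Value1, t2.Value2
FROM Table1 t1
JOIN Table2 t2 ON t1.Value1 LIKE t2.Value1 || '%';
-- Both sample values start with '00264' only, so each is paired with Value2 = 1.
```

If several prefixes can match the same value, this query returns one row per match, so choosing a single prefix (for example the longest) would need an extra step. Any % or _ characters stored in Table2.Value1 would also act as wildcards.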
