I have a user table with 100+ columns, where some of the columns have null values depending on the user type. I am looking for a SQL query to list the rows which have values in the columns for a particular user type. I have seen certain examples, but those queries have a specific column name in the condition. In my situation, I have many columns that can be null. Can someone help me with the query?
Thanks
If you have Oracle statistics enabled (I believe they are always enabled in 11g), you can use the NUM_DISTINCT column of USER_TAB_COLUMNS:
Create the test table and populate it:
CREATE TABLE TESTING (
JOE VARCHAR2(200),
FREDDY VARCHAR2(200),
CAR NUMBER );
INSERT INTO TESTING VALUES ('1', 'x', NULL);
INSERT INTO TESTING VALUES ('b', '2', NULL);
COMMIT;
ANALYZE TABLE TESTING COMPUTE STATISTICS;
The query you need:
SELECT * FROM USER_TAB_COLUMNS WHERE TABLE_NAME = 'TESTING' AND NUM_DISTINCT <> 0
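If you then want a SELECT that lists only the populated columns, you can generate one from the same view. A minimal sketch, assuming 11gR2 or later for LISTAGG (the table name is the test table above):
SELECT 'SELECT ' || LISTAGG(column_name, ', ') WITHIN GROUP (ORDER BY column_id)
       || ' FROM TESTING' AS generated_query
FROM USER_TAB_COLUMNS
WHERE TABLE_NAME = 'TESTING'
  AND NUM_DISTINCT <> 0;
Run the generated statement to see only the columns that actually hold data.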
We are loading data into a fact table. The original temporary table on Snowflake looks like the following, where the indicator_nbr fields are questions asked within a survey:
We are using data modelling techniques in building our warehouse database, so the data will be added into a fact table like so:
Then the same for indicators 2 and 3, and so on if there are other questions.
Each field with its value will be a single row. Of course there is other metadata to be added, like load_dt and record_src, but they are not a problem.
The current script is doing the following:
Get the fields into an array => fields_array = ['indicator_1', 'indicator_2', 'indicator_3']
A loop will run over the array and start adding each field with its value for each row. So if we have 100 rows, we will run 300 inserts, one at a time:
for (var col_num = 0; col_num < fields_array.length; col_num++) {
    var COL_NAME = fields_array[col_num];
    // Build one INSERT per indicator column, then run it against the temp table
    var field_value_query = "INSERT INTO SAT_FIELD_VALUE SELECT md5(id), CURRENT_TIMESTAMP(), NULL, 'SRC', " + COL_NAME + ", md5(foreign_key_field) FROM " + TEMP_TABLE_NAME;
    snowflake.execute({sqlText: field_value_query});
}
As mentioned in the comment on the post showing the full script, it would be better to build a single INSERT statement, looping over a string and concatenating each row's values into its VALUES list.
There are two issues with that suggested solution:
There is a size limit for a query on Snowflake (it must be less than 1 MB);
if we are going to loop over each field and concatenate the values, we would also have to run a SELECT against the temp table to get each column's value, so there is no real optimization; we might reduce the time a little, but not by much.
EDIT: A possible solution
I was thinking of running an SQL query selecting everything from the temp table, doing the hashing and everything, and saving it into an array after transposing, but I have no idea how to do it.
Not sure if this is what you're looking for but it seems as though you just want to do a pivot:
Setup example scenario
create or replace transient table source_table
(
id number,
indicator_1 varchar,
indicator_2 number,
indicator_3 varchar
);
insert overwrite into source_table
values (1, 'Test', 2, 'DATA'),
(2, 'Prod', 3, 'DATA'),
(3, 'Test', 1, 'METADATA'),
(4, 'Test', 1, 'DATA')
;
create or replace transient table target_table
(
hash_key varchar,
md5 varchar
);
Run insert
insert into target_table
select
name_col as hash_key,
md5(id)
from (select
id,
indicator_1,
indicator_2::varchar as indicator_2,
indicator_3
from source_table) unpivot ( val_col for name_col in (indicator_1, indicator_2, indicator_3))
;
This results in a target_table that looks like this:
+-----------+--------------------------------+
|HASH_KEY |MD5 |
+-----------+--------------------------------+
|INDICATOR_1|c4ca4238a0b923820dcc509a6f75849b|
|INDICATOR_2|c4ca4238a0b923820dcc509a6f75849b|
|INDICATOR_3|c4ca4238a0b923820dcc509a6f75849b|
|INDICATOR_1|c81e728d9d4c2f636f067f89cc14862c|
|INDICATOR_2|c81e728d9d4c2f636f067f89cc14862c|
|INDICATOR_3|c81e728d9d4c2f636f067f89cc14862c|
|INDICATOR_1|eccbc87e4b5ce2fe28308fd9f2a7baf3|
|INDICATOR_2|eccbc87e4b5ce2fe28308fd9f2a7baf3|
|INDICATOR_3|eccbc87e4b5ce2fe28308fd9f2a7baf3|
|INDICATOR_1|a87ff679a2f3e71d9181a67b7542122c|
|INDICATOR_2|a87ff679a2f3e71d9181a67b7542122c|
|INDICATOR_3|a87ff679a2f3e71d9181a67b7542122c|
+-----------+--------------------------------+
This is a great scenario for INSERT ALL (Snowflake's unconditional multi-table insert): the source is read once, and each source row fans out into one target row per indicator, all in a single statement:
INSERT ALL
INTO dst_tab(hash_key, md5) VALUES (indicator_1, md5)
INTO dst_tab(hash_key, md5) VALUES (indicator_2, md5)
INTO dst_tab(hash_key, md5) VALUES (indicator_3, md5)
SELECT MD5(id) AS md5, indicator_1, indicator_2::STRING AS indicator_2, indicator_3
FROM src_tab;
I am using Oracle SQL Developer. We are loading tables with data, and I need to validate that all the tables are populated and whether there are any columns that are completely null (all the rows are null for that column).
For tables, I am clicking each table, looking at the Data tab to find out whether the table is populated, and then looking through each of the columns using filters to figure out whether there are any completely null columns. I am wondering if there is a faster way to do this.
Thanks,
Suresh
You're in luck - there's a fast and easy way to get this information using optimizer statistics.
After a large data load the statistics should be gathered anyway. Counting NULLs is something the statistics gathering already does. With the default settings since 11g, Oracle will count the number of NULLs 100% accurately. (But remember that the number will only reflect that one point in time. If you add data later, the statistics must be re-gathered to get newer results.)
Sample schema
create table test1(a number); --Has non-null values.
create table test2(b number); --Has NULL only.
create table test3(c number); --Has no rows.
insert into test1 values(1);
insert into test1 values(2);
insert into test2 values(null);
commit;
Gather stats and run a query
begin
dbms_stats.gather_schema_stats(user);
end;
/
select table_name, column_name, num_distinct, num_nulls
from user_tab_columns
where table_name in ('TEST1', 'TEST2', 'TEST3');
Using the NUM_DISTINCT and NUM_NULLS you can tell if the column has non-NULLs (num_distinct > 0), NULL only (num_distinct = 0 and num_nulls > 0), or no rows (num_distinct = 0 and num_nulls = 0).
TABLE_NAME COLUMN_NAME NUM_DISTINCT NUM_NULLS
---------- ----------- ------------ ---------
TEST1 A 2 0
TEST2 B 0 1
TEST3 C 0 0
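To turn that rule into a single result set, here is a small sketch against the same sample schema (the CASE labels are just illustrative):
select table_name, column_name,
       case
         when num_distinct > 0 then 'has non-NULL values'
         when num_nulls > 0 then 'NULL only'
         else 'no rows'
       end as column_status
from user_tab_columns
where table_name in ('TEST1', 'TEST2', 'TEST3');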
Certainly. Write a SQL script that:
Enumerates all of the tables
Enumerates the columns within the tables
Determines the count of rows in each table
Iterates over each column and counts how many rows are NULL in that column
If the number of rows for the column that are null is equal to the number of rows in the table, you've found what you're looking for.
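Here is a minimal PL/SQL sketch of that brute-force approach, assuming you run it in your own schema with SERVEROUTPUT on. It uses dynamic SQL for every column, so it will be much slower than the statistics-based approach above:
declare
  v_total number;
  v_nulls number;
begin
  for t in (select table_name from user_tables) loop
    -- count the rows once per table
    execute immediate 'select count(*) from ' || t.table_name into v_total;
    if v_total = 0 then
      dbms_output.put_line(t.table_name || ' is not populated');
    end if;
    for c in (select column_name from user_tab_columns
              where table_name = t.table_name) loop
      -- count the NULLs in this column
      execute immediate 'select count(*) from ' || t.table_name ||
                        ' where ' || c.column_name || ' is null' into v_nulls;
      if v_total > 0 and v_nulls = v_total then
        dbms_output.put_line(t.table_name || '.' || c.column_name ||
                             ' is completely NULL');
      end if;
    end loop;
  end loop;
end;
/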
Here's how to do just one column in one table; if the COUNT comes back as anything higher than 0, it means there is data in it.
SELECT COUNT(<column_name>)
FROM <table_name>
WHERE <column_name> IS NOT NULL;
This query returns what you want:
select table_name,column_name,nullable,num_distinct,num_nulls from all_tab_columns
where owner='SCHEMA_NAME'
and num_distinct is null
order by column_id;
You can use the script below to get the empty columns in a table:
SELECT column_name
FROM all_tab_cols
where table_name in (<table>)
and avg_col_len = 0;
I am using Access 2013 and I am trying to insert rows into a table, but I don't want any duplicates. Basically, if a record does not exist in the table, insert it. I have tried using 'Not Exists' and 'Not In', and it still does not insert into the table. If I remove the WHERE condition it inserts, but if I enter the same record it duplicates it. Here is my code:
INSERT INTO [UB-04s] ( consumer_id, prov_id, total_charges, [non-covered_chrgs], patient_name )
VALUES ([Forms]![frmHospitalEOR]![client_ID], [Forms]![frmHospitalEOR]![ID], Forms![frmHospitalEOR].[frmItemizedStmtTotals].Form.[TOTAL BILLED], Forms![frmHospitalEOR].[frmItemizedStmtTotals].Form.[TOTAL BILLED], [Forms]![frmHospitalEOR]![patient_name])
WHERE [Forms]![frmHospitalEOR]![ID]
NOT IN (SELECT DISTINCT prov_id FROM [UB-04s]);
You cannot use WHERE in this kind of SQL:
INSERT INTO tablename (fieldname) VALUES ('value');
You can add a constraint to the database, like a unique index; the insert will then fail with an error message. It is possible to have NULL values in several rows; the unique index only makes sure that rows with values are unique.
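For example, a unique index on prov_id (assuming that is the field that must be unique) can be created with Access DDL; WITH IGNORE NULL keeps rows with a blank prov_id out of the index, matching the note above:
CREATE UNIQUE INDEX uq_ub04s_prov
ON [UB-04s] (prov_id)
WITH IGNORE NULL;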
To avoid these kinds of error messages, you can build a procedure or use code to check the data first, and then perform some action, like doing the insert or cancelling.
This select could be used to check data:
SELECT COUNT(*) FROM [UB-04s] WHERE prov_id = [Forms]![frmHospitalEOR]![ID]
It will return the number of rows with the specific value; if it is 0, then you are ready to run the insert.
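Alternatively, the conditional insert can be done in a single statement by switching from INSERT ... VALUES to INSERT ... SELECT, which does allow a WHERE clause. This sketch assumes a hypothetical one-row helper table named Dual (Access has no built-in DUAL, so you would create a table holding exactly one row for this purpose):
INSERT INTO [UB-04s] ( consumer_id, prov_id, total_charges, [non-covered_chrgs], patient_name )
SELECT [Forms]![frmHospitalEOR]![client_ID], [Forms]![frmHospitalEOR]![ID],
       Forms![frmHospitalEOR].[frmItemizedStmtTotals].Form.[TOTAL BILLED],
       Forms![frmHospitalEOR].[frmItemizedStmtTotals].Form.[TOTAL BILLED],
       [Forms]![frmHospitalEOR]![patient_name]
FROM Dual
WHERE [Forms]![frmHospitalEOR]![ID] NOT IN (SELECT prov_id FROM [UB-04s]);
Note that if prov_id can be NULL, use NOT EXISTS instead, since NOT IN never matches when the subquery returns a NULL.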
In DB2 I can do a command that looks like this to retrieve information from the inserted row:
SELECT *
FROM NEW TABLE (
INSERT INTO phone_book
VALUES ( 'Peter Doe','555-2323' )
) AS t
How do I do that in Postgres?
There are ways to retrieve a sequence value, but I need to retrieve arbitrary columns.
My desire to merge a select with the insert is for performance reasons. This way I only need to execute one statement to insert values and select values from the insert. The values that are inserted come from a subselect rather than a values clause. I only need to insert 1 row.
That sample code was lifted from the Wikipedia article on INSERT.
A plain INSERT ... RETURNING ... does the job and delivers best performance.
A CTE is not necessary.
INSERT INTO phone_book (name, number)
VALUES ( 'Peter Doe','555-2323' )
RETURNING * -- or just phonebook_id, if that's all you need
Aside: In most cases it's advisable to add a target list.
The Wikipedia page you quoted already has the same advice:
Using an INSERT statement with RETURNING clause for PostgreSQL (since
8.2). The returned list is identical to the result of a SELECT.
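Since your values come from a subselect rather than a VALUES clause, note that RETURNING works with INSERT ... SELECT as well. A minimal sketch, with a hypothetical staging table standing in for your real source:
INSERT INTO phone_book (name, number)
SELECT name, number
FROM   phone_book_staging  -- hypothetical source of the one row to insert
WHERE  id = 1
RETURNING *;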
PostgreSQL supports this kind of behavior through a returning clause in a common table expression. You generally shouldn't assume that something like this will improve performance simply because you're executing one statement instead of two. Use EXPLAIN to measure performance.
create table test (
test_id serial primary key,
col1 integer
);
with inserted_rows as (
insert into test (col1) values (3)
returning *
)
select * from inserted_rows;
 test_id | col1
---------+------
       1 |    3
I switch between Oracle and SQL Server occasionally, and often forget how to do some of the most trivial tasks in SQL Server. I want to manually insert a row of data into a SQL Server database table using SQL. What is the easiest way to do that?
For example, if I have a USERS table, with the columns of ID (number), FIRST_NAME, and LAST_NAME, what query do I use to insert a row into that table?
Also, what syntax do I use if I want to insert multiple rows at a time?
To insert a single row of data:
INSERT INTO USERS
VALUES (1, 'Mike', 'Jones');
To do an insert on specific columns (as opposed to all of them), you must specify the columns you want to insert into:
INSERT INTO USERS (FIRST_NAME, LAST_NAME)
VALUES ('Stephen', 'Jiang');
To insert multiple rows of data in SQL Server 2008 or later:
INSERT INTO USERS VALUES
(2, 'Michael', 'Blythe'),
(3, 'Linda', 'Mitchell'),
(4, 'Jillian', 'Carson'),
(5, 'Garrett', 'Vargas');
To insert multiple rows of data in earlier versions of SQL Server, use "UNION ALL" like so:
INSERT INTO USERS (FIRST_NAME, LAST_NAME)
SELECT 'James', 'Bond' UNION ALL
SELECT 'Miss', 'Moneypenny' UNION ALL
SELECT 'Raoul', 'Silva'
Note, the "INTO" keyword is optional in INSERT queries. Source and more advanced querying can be found here.
Here are 4 ways to insert data into a table.
Simple insertion when the table column sequence is known.
INSERT INTO Table1 VALUES (1,2,...)
Simple insertion into specified columns of the table.
INSERT INTO Table1(col2,col4) VALUES (1,2)
Bulk insertion when...
You wish to insert every column of Table2 into Table1
You know the column sequence of Table2
You are certain that the column sequence of Table2 won't change while this statement is being used (perhaps the statement will only be used once).
INSERT INTO Table1 {Column sequence} SELECT * FROM Table2
Bulk insertion of selected data into specified columns of Table2.
INSERT INTO Table1 (Column1,Column2 ....)
SELECT Column1,Column2...
FROM Table2
I hope this helps you.
Create table :
create table users (id int,first_name varchar(10),last_name varchar(10));
Insert values into the table :
insert into users (id,first_name,last_name) values(1,'Abhishek','Anand');
For example, "person" table has "id" IDENTITY column as shown below:
CREATE TABLE person (
id INT IDENTITY, -- Here
name NVARCHAR(50),
age INT,
PRIMARY KEY(id)
)
Then we don't need to manually supply a value for the "id" IDENTITY column when inserting a row:
INSERT INTO person VALUES ('John', 27) -- The value for "id" is not needed
And we can also insert multiple rows without values for the "id" IDENTITY column:
INSERT INTO person VALUES ('John', 27), ('Tom', 18)
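A small robustness note: omitting the column list works here only because "id" is the sole column being skipped. Listing the target columns explicitly keeps the statement valid even if columns are added to the table later:
INSERT INTO person (name, age) VALUES ('John', 27), ('Tom', 18)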