Empty value from CSV import (sqlalchemy) to Postgres numeric column appears blank

I imported a CSV file into my Postgres database via SQLAlchemy, creating the table with its columns and attributes and then loading the data with the copy-from method.
Now I have the following situation: my CSV file contains empty cells. In the database (which I view through SQL Workbench) I see empty cells as well, and the column type is still numeric. However, as far as I have found, a numeric column cannot hold empty / blank cells, only NULL.
On the other hand I tried to validate this via:
SELECT COUNT(column_a)
FROM table_name
WHERE column_a IS NULL;
which returns 0, from which I infer that the cells that appear empty are not NULL.
The reason I ask: I would like to find all "real" 0s in my table and replace them with either empty / blank or NULL (consistency is what matters here), because I need to stream data from the database and load a subset into a pandas DataFrame, and I am not sure how pandas treats the different representations of missing values.
Thank you.

I indirectly found the answer to the above question:
- An empty value is treated as NULL in Postgres; SQL Workbench simply displays it as an empty cell.
- COUNT(column) skips NULLs, so it will never count the empty cells (Stack Overflow: validate-if-a-column-has-a-null-value).
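
The second point can be checked with a small sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for Postgres (an assumption made only to keep the example self-contained; COUNT treats NULLs the same way in both), and the sample rows are made up. It also shows one possible way to turn the "real" 0s into NULL, as asked in the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (column_a NUMERIC)")
conn.executemany(
    "INSERT INTO table_name (column_a) VALUES (?)",
    [(1.5,), (None,), (0,), (None,), (2,)],  # made-up sample data
)

# COUNT(column_a) ignores NULLs, so combined with IS NULL it is always 0.
count_column = conn.execute(
    "SELECT COUNT(column_a) FROM table_name WHERE column_a IS NULL"
).fetchone()[0]

# COUNT(*) counts rows, so it reports how many NULLs there really are.
count_rows = conn.execute(
    "SELECT COUNT(*) FROM table_name WHERE column_a IS NULL"
).fetchone()[0]

# Replace the "real" zeros with NULL so missing values are consistent.
conn.execute("UPDATE table_name SET column_a = NULL WHERE column_a = 0")
nulls_after_update = conn.execute(
    "SELECT COUNT(*) FROM table_name WHERE column_a IS NULL"
).fetchone()[0]

print(count_column, count_rows, nulls_after_update)
```

So the original query should use COUNT(*) rather than COUNT(column_a) to count the NULL cells.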

Related

SSIS importing percentage column from Excel displaying as NULL in database

I have an ETL process set up to take data from an Excel spreadsheet and store it in a database using SSIS. However, one of the columns in the Excel file is formatted as a percent, and it will sometimes erroneously be stored as a NULL value in the database, as if there were some sort of translation error.
Pictured is the exact format being used for the column in Excel.
Interestingly, these percent values do load properly on some days, but for some reason one particular Excel sheet I was given as an example of this issue will not load any of them at all when put through the SSIS processor.
In Excel, these values will show up like "50.00%", and when the SSIS processor is able to translate them properly it will display as the decimal equivalent in the database, "0.5", which is what I want instead of the NULL values. The data type I am using in SSIS for this is Unicode string [DT_WSTR], and it is saved as an NVARCHAR in the database.
Any insight as to why these values will sometimes not display/translate as intended? I have tried messing around with the data types in SSIS/SQL Server, but it has either resulted in no change or error. When I put test values in the Excel sheet, such as "test" to see if it is importing anything at all from this column, it does seem to work (just not for the percent numbers that I need).

The issue was caused by the "mixed data types" that were present in the first few rows of my data (the "mixed" part being blank fields), which would explain why some sheets would work and others wouldn't.
https://stackoverflow.com/a/542573/11815822
Setting the connection string to accommodate this fixed the issue.

Why do SQL varchars (256) not get populated to my flat file in SSIS package? [duplicate]

I have an SSIS package which sources from an Excel file, performs a lookup in SQL, and then writes the fields from the lookup to a flat file. For some reason, any of the fields in the SQL table that are of data type varchar(256) are not getting written. They are coming in as nulls. My other fields, including varchar(255), are coming across fine. I have tried both a flat file and Excel as the destination with no luck.
I've tried converting the varchar with a data conversion to both 256 and to a Unicode string, with no luck.
Even when I preview a simple query in the source component (e.g. select lastname from xyz), the preview shows the lastname as null. It doesn't show other fields that have different data types as nulls.

This usually happens when the Excel driver reads only the first 8 rows of data and guesses the wrong data type because it has too little data to check. Here are some of the known issues documented on the Microsoft site:
Issues with importing
Empty rows
When you specify a worksheet or a named range as the source, the driver reads the contiguous block of cells starting with the first non-empty cell in the upper-left corner of the worksheet or range. As a result, your data doesn't have to start in row 1, but you can't have empty rows in the source data. For example, you can't have an empty row between the column headers and the data rows, or a title followed by empty rows at the top of the worksheet.
If there are empty rows above your data, you can't query the data as a worksheet. In Excel, you have to select your range of data and assign a name to the range, and then query the named range instead of the worksheet.
Missing values
The Excel driver reads a certain number of rows (by default, eight rows) in the specified source to guess at the data type of each column. When a column appears to contain mixed data types, especially numeric data mixed with text data, the driver decides in favor of the majority data type, and returns null values for cells that contain data of the other type. (In a tie, the numeric type wins.) Most cell formatting options in the Excel worksheet do not seem to affect this data type determination.
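
To make the described behavior concrete, here is a rough Python model of it. This is only a sketch of the rules stated above (sample the first eight rows, majority type wins, ties go to numeric, values of the other type come back as null), not the driver's actual code. The function name `guess_and_read` is hypothetical, and ignoring blank cells when sampling is an assumption.

```python
from typing import Any, List, Optional


def guess_and_read(column: List[Any], sample_rows: int = 8) -> List[Optional[Any]]:
    """Toy model of the type guessing described above (not the real driver)."""
    # Look only at the first `sample_rows` cells; blanks are skipped (assumption).
    sample = [v for v in column[:sample_rows] if v is not None and v != ""]
    numeric = sum(isinstance(v, (int, float)) for v in sample)
    text = len(sample) - numeric
    column_is_numeric = numeric >= text  # in a tie, the numeric type wins

    result: List[Optional[Any]] = []
    for v in column:
        if v is None or v == "":
            result.append(None)  # blank cells stay null
        elif isinstance(v, (int, float)) == column_is_numeric:
            result.append(v)     # value matches the guessed column type
        else:
            result.append(None)  # value of the "other" type is returned as null
    return result


# Eight numeric rows are sampled, so the later text value is lost.
print(guess_and_read([1, 2, 3, 4, 5, 6, 7, 8, "N/A", 9]))
# Mostly text in the sample, so the single number is lost instead.
print(guess_and_read(["a", "b", "c", 1, "d"]))
```

This is why a sheet can load fine one day and produce nulls the next: the outcome depends on what happens to sit in the sampled rows.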
You can modify this behavior of the Excel driver by specifying Import Mode to import all values as text. To specify Import Mode, add IMEX=1 to the value of Extended Properties in the connection string of the Excel connection manager in the Properties window.
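
As an illustration, a connection string with IMEX=1 might be assembled as in the sketch below. The helper name `excel_connection_string` and the file path are hypothetical, and the provider name and "Excel 12.0 Xml" version string are assumptions that fit an .xlsx file read through the ACE OLE DB provider; adjust them to match your own connection manager.

```python
def excel_connection_string(path: str, header: bool = True, imex: bool = True) -> str:
    """Build an Excel OLE DB connection string (hypothetical helper)."""
    # Extended Properties holds the Excel version, header flag and import mode.
    props = ["Excel 12.0 Xml", "HDR=YES" if header else "HDR=NO"]
    if imex:
        props.append("IMEX=1")  # import mode: read mixed-type columns as text
    extended = ";".join(props)
    return (
        "Provider=Microsoft.ACE.OLEDB.12.0;"  # assumed provider
        f"Data Source={path};"
        f'Extended Properties="{extended}";'
    )


# Hypothetical file path, used only for illustration.
print(excel_connection_string(r"C:\data\report.xlsx"))
```

The important part is that IMEX=1 sits inside the quoted Extended Properties value, not alongside it as a separate top-level key.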
Truncated text
When the driver determines that an Excel column contains text data, the driver selects the data type (string or memo) based on the longest value that it samples. If the driver does not discover any values longer than 255 characters in the rows that it samples, it treats the column as a 255-character string column instead of a memo column. Therefore, values longer than 255 characters may be truncated.
To import data from a memo column without truncation, you have two options:
- Make sure that the memo column in at least one of the sampled rows contains a value longer than 255 characters.
- Increase the number of rows sampled by the driver to include such a row. You can increase the number of rows sampled by increasing the value of TypeGuessRows under the following registry key:
Redistributable components version - Registry key
Excel 2016 - HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Office\16.0\Access Connectivity Engine\Engines\Excel
Excel 2010 - HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel

Coloring Excel Cell based on a table condition

I am going through the users of a system and reviewing whether they have appropriate role names. I then completed an Excel table that looks a bit like this:
I'm trying to turn the table into a more readable format. I have made a pivot that looks like this:
But I'm not sure how to highlight the cells to reflect the 'Access Appropriate? Yes/No' column. Ideally, it should be colored yellow if the 'Access Appropriate?' = 'No'. I'm thinking of using VBA, but was wondering if there is an easier solution using formulas or pivot table?

Your pivoted data isn't an actual Excel pivot table, is it? I know what the x's mean, but where do they come from?
Two possibilities come to mind if you want a flexible setup without VBA, as well as a rather simple VBA approach that uses a UDF.
Quick and dirty (really dirty) would be to:
- use 1/0 instead of Yes/No (you could write that into a helper column with an IF function)
- create a new pivot with ROLE_NAME for columns, USER_NAME for rows and SUM or MAX of [Access appropriate] for values; instead of your x you will then end up with 1 and 0, and empty cells will still be empty
- conditionally format the value range, e.g. green if 1, yellow if 0, nothing if ""
Alternatively, you could build your output table with formulas such as INDEX, MATCH and VLOOKUP:
- an additional key column with USERNAME&ROLE_NAME will be needed
- conditionally format the value range
VBA: provided your rows are distinct, a user-defined function could do the following:
- read the data into a recordset if that hasn't been done already (meaning: the recordset is declared at module level and the first function call fills it)
- access the data in your recordset with a Recordset.Filter based on your input parameters (USERNAME and ROLE_NAME, in your case)
- output a certain Field.Value based on your input parameter (Access Appropriate, in your case)
- conditionally format the TRUE/FALSE values you get (since this can't easily be done inside a UDF)

Excel to SQL table field value appending with 0

I loaded an Excel file into an SQL table. In the Excel file, one field contains VARCHAR data (formatted as General). When loaded into the SQL table, some of these values are prefixed with a zero.
Example: 1081999 in the Excel file becomes 01081999 in the SQL table.
What might be the reason for this?

Excel will hide leading 0s when it identifies the field's content as a number and displays it as such. I would assume that the Excel worksheet does indeed contain these leading 0s and they are simply not shown by Excel. If you change the type of the column from General to Text, do they show up?
As a side note, if these are indeed numbers you should be storing them in a numeric datatype in the database...
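
A tiny Python illustration of the distinction drawn above, using the value from the question: the same digits keep their leading zero as text but lose it once they are interpreted as a number. The fixed width of 8 used to pad the number back is an assumption for this example only.

```python
raw_text = "01081999"         # the value as it arrives in the VARCHAR column

as_number = int(raw_text)     # numeric interpretation drops the leading zero
as_text = raw_text            # text interpretation keeps every character

# Padding back to a fixed width (8 is assumed here) restores the zero.
restored = str(as_number).zfill(8)

print(as_number, as_text, restored)
```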

Handling Blanks in Yes/No columns

I have an excel file with one column that has either Yes/No or Blanks. When I import this into Access, the column data type is automatically set to Yes/No. When I open the table it mirrors what was in the excel file, which is fine, example below:
Tasked
Yes
No
No
When I use this table as part of a make-table build (which includes several outer joins), the Tasked column reads -1/0 instead. The problem with this is that the blanks are set to 0, so I can't tell which jobs have yet to be Tasked.
I understand you can't have blanks in a Yes/No column in Access, but in practice this doesn't help.
I have read several forums with advice on this, but nothing is working.
