Import data from Excel to PostgreSQL - sql

I have seen questions on Stack Overflow similar to (or the same as) the one I am asking now, but I couldn't manage to solve the problem in my situation.
Here is the thing:
I have an Excel spreadsheet (.xlsx) which I converted to comma-separated values (.csv), as some answers suggest.
My Excel file looks something like this:
--------------------------------------------------
name | surname | voteNo | VoteA | VoteB | VoteC
--------------------------------------------------
john | smith | 1001 | 30 | 154 | 25
--------------------------------------------------
anothe| person | 1002 | 430 | 34 | 234
--------------------------------------------------
other | one | 1003 | 35 | 154 | 24
--------------------------------------------------
john | smith | 1004 | 123 | 234 | 53
--------------------------------------------------
john | smith | 1005 | 23 | 233 | 234
--------------------------------------------------
In PostgreSQL I created a table named allfields with 6 columns:
the 1st and 2nd as character[] and the last 4 as integers, with the same names as shown in the Excel table (name, surname, voteno, votea, voteb, votec)
Now I'm doing this:
copy allfields from 'C:\Filepath\filename.csv';
But I'm getting this error:
could not open file "C:\Filepath\filename.csv" for reading: Permission denied
SQL state: 42501
My questions are:
Should I create those columns in allfields table in PostgreSQL?
Do I have to modify anything else in Excel file?
And why do I get this 'permission denied' error?

1) Based on your file, neither of the first two columns needs to be an array type (character[]) - unlike C-strings, the "character" type in postgres is a string already. You might want to make things easier and use varchar as the type of those two columns instead.
2) I don't think you do.
3) Check that you don't still have that file open and locked in Excel - if you did a "save as" to convert from xlsx to csv from within Excel, then you'll likely need to close the file in Excel.

SQL state 42501 (insufficient_privilege) in PostgreSQL means you don't have permission to perform the operation in the intended schema, as the error code list shows.
Check that you're pointing to the correct schema and that your user has enough privileges.
The documentation also states that you need SELECT privilege on the source table and INSERT privilege on the destination table:
You must have select privilege on the table whose values are read by
COPY TO, and insert privilege on the table into which values are
inserted by COPY FROM. It is sufficient to have column privileges on
the column(s) listed in the command.

1) Yes, I think you can. The COPY command has an optional HEADER clause. See
http://www.postgresql.org/docs/9.2/static/sql-copy.html
2) I don't think so. With my #1 and #3, it should work.
3) You need superuser permission for that.

1) Should I create those columns in allfields table in PostgreSQL?
Use text for the character fields. Not an array in any case, as @yieldsfalsehood pointed out correctly.
2) Do I have to modify anything else in Excel file?
No.
3) And why do I get this 'permission denied' error?
The file needs to be accessible to your system user postgres (or whatever user you are running the postgres server as). Per the documentation:
COPY with a file name instructs the PostgreSQL server to directly read
from or write to a file. The file must be accessible to the server and
the name must be specified from the viewpoint of the server.
The privileges of the database user are not the cause of the problem. However (quoting the same page):
COPY naming a file or command is only allowed to database superusers,
since it allows reading or writing any file that the server has privileges to access.

Regarding the permission problem, if you are using psql to issue the COPY command, try using \copy instead.
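
To make the difference between the two forms concrete, here is a minimal Python sketch that only builds the command text; it does not connect to any database. The helper names quote_literal and copy_command are hypothetical. The FORMAT csv and HEADER options come from the COPY documentation linked above (option syntax of PostgreSQL 9.0 and later), and the table name and path are the ones from the question.

```python
def quote_literal(value: str) -> str:
    """Quote a string as a SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def copy_command(table: str, csv_path: str, client_side: bool = False) -> str:
    """Build a server-side COPY statement or a psql \\copy meta-command.

    COPY makes the *server* process open the file, so the server's OS user
    must be able to read it. \\copy makes *psql* open the file with the
    permissions of whoever runs psql, then streams it to the server.
    """
    verb = "\\copy" if client_side else "COPY"
    options = "WITH (FORMAT csv, HEADER true)"  # skip the header row of the CSV
    command = f"{verb} {table} FROM {quote_literal(csv_path)} {options}"
    # A SQL statement ends with a semicolon; a psql meta-command ends at the newline.
    return command if client_side else command + ";"


if __name__ == "__main__":
    path = r"C:\Filepath\filename.csv"  # path taken from the question
    print(copy_command("allfields", path))
    print(copy_command("allfields", path, client_side=True))
```

The two commands differ only in who opens the file, which is why switching to \copy can sidestep the "Permission denied" error without touching Windows permissions.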

OK, the problem was that I needed to change the path of the Excel file. I put it in the Public user folder, where all users can access it.
If you face the same problem, move your Excel file to, for example, the C:\Users\Public folder (this folder is a public folder without any restrictions); otherwise you have to deal with Windows permission issues.

For those who, for some reason, do not wish to move the files they want to read to a different (public) location, here is a clear solution.
Right click the folder holding the file and select properties.
Select the Security tab under properties.
Select Edit
Select Add
In the field "Enter the object names to select", type Everyone
Click OK in all the dialog boxes, or Apply if it is enabled
Try reading the file again.

Related

Specify multiple delimiters for Redshift copy command

Is there a way to specify multiple delimiters to the Redshift COPY command while loading data?
I have a data file with the following format:
1 | ab | cd | ef
2 | gh | ij | kl
I am using a command like this:
COPY MY_TBL
FROM 's3://s3-file-path'
iam_role 'arn:aws:iam::ddfjhgkjdfk'
manifest
IGNOREHEADER 1
gzip delimiter '|';
Fields are separated by | and records are separated by newlines. How do I copy this data into Redshift? My query above gives me a "delimiter not found" error.

No, delimiters are single characters.
From Data Format Parameters:
Specifies the single ASCII character that is used to separate fields in the input file, such as a pipe character ( | ), a comma ( , ), or a tab ( \t ).
You could import it with a pipe delimiter, then run an UPDATE that uses TRIM() to strip off the spaces.
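
As an illustration of what "split on one character, then trim" does to the sample rows, here is a minimal Python sketch. It is a client-side stand-in for the load-then-UPDATE approach, not Redshift code, and the function names split_and_trim and normalize are hypothetical. The same functions could also be used to clean the file before uploading it, which is a different technique from the one suggested above.

```python
from typing import Iterable, List


def split_and_trim(line: str, delimiter: str = "|") -> List[str]:
    """Split on a single-character delimiter, then trim each field."""
    if len(delimiter) != 1:
        raise ValueError("delimiter must be a single character")
    return [field.strip() for field in line.rstrip("\n").split(delimiter)]


def normalize(lines: Iterable[str], delimiter: str = "|") -> List[str]:
    """Rewrite ' | '-separated lines as plain '|'-separated lines."""
    return [delimiter.join(split_and_trim(line, delimiter)) for line in lines]


if __name__ == "__main__":
    sample = ["1 | ab | cd | ef\n", "2 | gh | ij | kl\n"]  # rows from the question
    for row in normalize(sample):
        print(row)
```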
Your error above suggests that something in your data is causing the COPY command to fail. This could be a number of things, from file encoding, to some funky data in there. I've struggled with the "delimiter not found" error recently, which turned out to be the ESCAPE parameter combined with trailing backslashes in my data which prevented my delimiter (\t) from being picked up.
Fortunately, there are a few steps you can take to help you narrow down the issue:
stl_load_errors - This system table contains details on any error logged by Redshift during the COPY operation. This should be able to identify the row number in your data file that is causing the problem.
NOLOAD - will allow you to run your copy command without actually loading any data to Redshift. This performs the COPY ANALYZE operation and will highlight any errors in the stl_load_errors table.
FILLRECORD - This allows Redshift to "fill" any columns that it sees as missing in the input data. This is essentially to deal with any ragged-right data files, but can be useful in helping to diagnose issues that can lead to the "delimiter not found" error. This will let you load your data to Redshift and then query in database to see where your columns start being out of place.
From the sample you've posted, your setup looks good, but obviously this isn't the entire picture. The options above should help you narrow down the offending row(s) to help resolve the issue.

Get file info from file path in SQL Server database

I have a document table in a SQL Server 2008 R2 database with a structure like this:
id | date_created | file_path | file_type
---+--------------+-----------------------+--------------------
1 | 2016-11-14 | \\server\docs\123.doc | application/msword
2 | 2016-11-15 | \\server\imgs\456.png | image/png
I need to determine the file size of a subset of documents. So I have a query that will select certain rows from the document table (based on their ID) and I would need to find out what the total file size is of that set of documents. I did some Googling (before coming here of course) but most things I can find related to files/SQL is about log file sizes which is obviously NOT what I want.
Any and all help is appreciated as always! Thanks!

I'm answering my own question for completeness...
I was unable to use any of the options provided in the comments on the question, due to limitations in the Production environment where this query needed to be run. Instead, I ran the query to select the desired rows and exported it to a CSV file. I then wrote a quick and dirty Java program (only because it's my most comfortable language and I had similar projects from the past that I could reuse) that took a CSV file as an argument and parsed the CSV and checked each of the files and output a total file size in the console. While it doesn't solve the original question in SQL, it did resolve the problem in this particular case.
Note: If anyone has a SQL solution they can submit as an answer that I can verify as a valid answer, I will switch to that as the accepted answer.
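
For readers who want to reproduce the workaround, here is a minimal sketch of the same idea in Python rather than Java. It is not the author's program. It assumes the exported CSV has a header row and a column named file_path (as in the table above), and the function name total_file_size is hypothetical.

```python
import csv
import os
from typing import List, Tuple


def total_file_size(csv_path: str, path_column: str = "file_path") -> Tuple[int, List[str]]:
    """Sum the sizes, in bytes, of the files listed in one column of a CSV.

    Returns the total and the list of paths whose size could not be read.
    """
    total = 0
    unreadable = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            path = row[path_column]
            try:
                total += os.path.getsize(path)
            except OSError:
                # Missing file, unreachable share, or insufficient permissions.
                unreadable.append(path)
    return total, unreadable


if __name__ == "__main__":
    import sys

    size, problems = total_file_size(sys.argv[1])
    print(f"total bytes: {size}")
    for path in problems:
        print(f"could not read: {path}")
```

The script has to run under an account that can reach the \\server shares. Unreadable paths are reported separately so they do not silently shrink the total.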

How do I update a database that's in use?

I'm building a web application using ASP.NET MVC with SQL Server and my development process is going to be like
Make changes in SQL Server locally
Create LINQ-to-SQL classes as necessary
Before committing any change set that includes a database change, script out the database so that I can regenerate it if I ever need to
What I'm confused about is how I'm going to update the production database, which will have live data in it.
For example, let's say I have a table like
People
========================================
Id | FirstName | LastName | FatherId
----------------------------------------
1 | 'Anakin' | 'Skywalker' | NULL
2 | 'Luke' | 'Skywalker' | 1
3 | 'Leah' | 'Skywalker' | 1
in production and locally and let's say I add an extra column locally
ALTER TABLE People ADD LightsaberColor VARCHAR(16)
and update my LINQ to SQL, script it out, test it with sample data and decide that I want to add that column to production.
As part of a deployment process, how would I do that? Does there exist some tool that could read my database generation file (call it GenerateDb.sql) and figure out that it needs to update the production People table to put default values in the new column, like
People
==========================================================
Id | FirstName | LastName | FatherId | LightsaberColor
----------------------------------------------------------
1 | 'Anakin' | 'Skywalker' | NULL | NULL
2 | 'Luke' | 'Skywalker' | 1 | NULL
3 | 'Leah' | 'Skywalker' | 1 | NULL
???

You should have a staging DB that is identical to the production database.
When you make any changes to the database, you should apply them to the staging DB first; you can of course compare the dev and staging DBs to generate a script with the differences.
Visual Studio has a Schema Compare tool that generates a script with the differences between two databases.
There are some other tools as well that do the same.
So you can generate the script, apply it to the staging DB, and if everything goes fine, apply the script to the production DB.
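
To show what such a change script does to a table that already holds data, here is a minimal Python sketch that uses an in-memory SQLite database as a stand-in for SQL Server. It is not the output of Schema Compare. The helper add_column_if_missing is hypothetical, and the "skip if already present" check is an assumption added so the script can be run safely against dev, staging, and production in turn. Note that SQLite spells the statement ADD COLUMN, whereas SQL Server's T-SQL uses ADD without the COLUMN keyword.

```python
import sqlite3


def add_column_if_missing(conn, table, column, ddl_type):
    """Add a column unless it already exists, so the script can be re-run."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    existing = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    if column in existing:
        return False
    conn.execute(f"ALTER TABLE {table} ADD COLUMN {column} {ddl_type}")
    return True


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE People (Id INTEGER PRIMARY KEY, FirstName TEXT, "
        "LastName TEXT, FatherId INTEGER)"
    )
    conn.executemany(
        "INSERT INTO People VALUES (?, ?, ?, ?)",
        [(1, "Anakin", "Skywalker", None), (2, "Luke", "Skywalker", 1), (3, "Leah", "Skywalker", 1)],
    )
    first = add_column_if_missing(conn, "People", "LightsaberColor", "VARCHAR(16)")
    second = add_column_if_missing(conn, "People", "LightsaberColor", "VARCHAR(16)")
    rows = conn.execute(
        "SELECT Id, FirstName, LightsaberColor FROM People ORDER BY Id"
    ).fetchall()
    return first, second, rows


if __name__ == "__main__":
    print(demo())
```

The existing rows survive the change and receive NULL in the new column, which matches the table the question expects to end up with.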
Actually, that is right: you must have a staging process. Whenever we commit features, we use TFS to move them from Development to Production, and that is called staging. You can look up the history in TFS, whether for the database or the solution.
If you're not using TFS with Visual Studio and MSSQL Server, I guess that you are committing your features directly to your server, which is your production test. Alternatively, you can test them on your test server first to see the changes.
Another thing: if you use stored procedures, I guess you can use temporary tables, if you're asking about the script.
I guess that it's your first time committing to a live server.

Dynamic query sqlplus

I have a list of account names in listnames.txt
listnames.txt
James
Joey
Pete
I want to query that list of names against the account table using SQL*Plus. listnames.txt keeps changing; can SQL*Plus loop over listnames.txt and read it?
Account table
1|Mike
2|James
3|Harris
4|Joey
5|Carl
6|Pete
Thanks
jigo

It's better to create an external table and, based on that data, loop through your lookup table.
With an external table you don't load data into the DB; it just creates a metadata structure for the file. So even if your file changes, you can read the changed data directly from the file with a simple SELECT, with no need to load it again and again.
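
The useful property of an external table is that the file is read when the query runs, not when the table is created. Here is a minimal Python sketch of that behavior, using SQLite in place of Oracle. It is not an Oracle external table, and the function names read_names and accounts_for are hypothetical; the account table and listnames.txt contents follow the question.

```python
import sqlite3
from typing import List, Tuple


def read_names(path: str) -> List[str]:
    """Read one name per line, ignoring blank lines."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]


def accounts_for(conn: sqlite3.Connection, names_path: str) -> List[Tuple[int, str]]:
    """Look up the accounts whose names are currently listed in the file."""
    names = read_names(names_path)  # re-read on every call, like an external table
    if not names:
        return []
    placeholders = ", ".join("?" for _ in names)
    sql = f"SELECT id, name FROM account WHERE name IN ({placeholders}) ORDER BY id"
    return conn.execute(sql, names).fetchall()
```

Calling accounts_for again after listnames.txt changes returns rows for the new list, with no reload step in between.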

Updating multiple rows with information stored in text file

I have a comma delimited text file containing discrepancies across two different databases, and need to update one of the databases with information from the aforementioned text file. The text file is in the following format:
ID valueFromDb1 valueFromDb2
1 1234 4321
2 2345 5432
... ... ...
I need to update a table by checking for the ID value and, where valueFromDb1 exists, replacing it with valueFromDb2. There are around 11,000 rows that need to be updated. Is there a way I can access the information in this text file directly through an SQL query? My other thought was to write a Java program to do this for me, but I'm not convinced that is the easiest solution.

The article below demonstrates one way to read a text file in MS SQL Server by using xp_cmdshell. In order for it to work the file has to be on one of the drives of the server. Once you have the file loaded into a table variable (which is what the code in the article will do) you should be able to do the joins and updates pretty easily. Let us know if you need any other help.
http://www.kodyaz.com/articles/read-text-file-using-xp_cmdshell.aspx
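
The "load the file into a table, then join and update" step can be sketched as follows. This is a minimal Python example that uses SQLite in place of SQL Server and does not use xp_cmdshell. The table name target, its columns id and value, the staging table fixes, and the function apply_discrepancies are all hypothetical. The parser accepts either commas or whitespace between fields, since the question describes the file as comma delimited but shows it space-separated.

```python
import sqlite3
from typing import Iterable


def apply_discrepancies(conn: sqlite3.Connection, lines: Iterable[str]) -> int:
    """Stage (id, old, new) rows from the file, then update matching rows.

    Returns the number of rows updated in the hypothetical `target` table.
    """
    conn.execute(
        "CREATE TEMP TABLE fixes (id INTEGER PRIMARY KEY, old_value TEXT, new_value TEXT)"
    )
    rows = []
    for line in lines:
        parts = line.replace(",", " ").split()
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # skip the header row, blank lines, and malformed rows
        rows.append((int(parts[0]), parts[1], parts[2]))
    conn.executemany("INSERT INTO fixes VALUES (?, ?, ?)", rows)
    # Update only rows whose current value still equals the value from Db1.
    cursor = conn.execute(
        """
        UPDATE target
        SET value = (SELECT new_value FROM fixes WHERE fixes.id = target.id)
        WHERE EXISTS (
            SELECT 1 FROM fixes
            WHERE fixes.id = target.id AND fixes.old_value = target.value
        )
        """
    )
    updated = cursor.rowcount
    conn.execute("DROP TABLE fixes")
    return updated
```

Doing the update as one set-based statement against a staging table avoids issuing 11,000 individual UPDATE statements.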