I need some sort of starting point for the following task:
A script of some kind should watch a certain folder for incoming Excel files. These Excel files all contain a worksheet with the same name. The script should then take a certain range of columns from each Excel file, run down all the rows, and write the column data to a running Microsoft SQL Server table.
Now I don't know what scripting language I should/could use for this task. Maybe Perl or Windows PowerShell?
Hope you can point me in the right direction.
UPDATE: Shouldn't I also look into SSIS, since it seems to offer quite a lot of features?
Thank you.
You can create a Windows Service in .NET that monitors a certain folder at a set interval (say, every 10 minutes).
Using ADO.NET, you can connect to both Excel workbooks and SQL Server and perform SQL-style data transformations. If the Excel doc isn't conducive to running SQL queries, there's always MS Office Interop for interfacing with Excel to read specific cell values (this tends to be more difficult than the former).
I would probably write a script in Perl or Python that tries to open each file in the folder and, if successful, parses the data into either dictionaries/hashes (it would be really easy to break rows and columns into a hash) or arrays, making it easy to write to a database.
Then (my knowledge is better in Python, sorry) use the pyodbc module, or whatever other module is necessary, to connect to the server and start writing. I am sorry if this is not as helpful, I am a beginner.
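If you go the Python route, here is a minimal sketch of the whole loop: poll the folder, read the shared worksheet with openpyxl, and insert the rows with pyodbc. Everything specific in it (folder paths, worksheet name, connection string, table and column names, and the assumption that the data sits in columns A-D starting at row 2) is a placeholder to adapt, not something taken from your setup:

import time
import shutil
from pathlib import Path

import pyodbc
from openpyxl import load_workbook

WATCH_DIR = Path(r"C:\incoming")        # folder to watch (placeholder)
DONE_DIR = WATCH_DIR / "processed"      # imported files are moved here
SHEET_NAME = "Data"                     # worksheet name shared by all files (placeholder)
CONN_STR = ("DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes;")

def import_file(xlsx_path, cursor):
    # Read columns A-D of every data row (header assumed in row 1) and insert them.
    wb = load_workbook(xlsx_path, read_only=True, data_only=True)
    ws = wb[SHEET_NAME]
    rows = [r for r in ws.iter_rows(min_row=2, min_col=1, max_col=4, values_only=True)
            if any(cell is not None for cell in r)]
    if rows:
        cursor.executemany(
            "INSERT INTO dbo.ImportedData (Col1, Col2, Col3, Col4) VALUES (?, ?, ?, ?)",
            rows)
    wb.close()

def main():
    DONE_DIR.mkdir(exist_ok=True)
    while True:
        for xlsx in WATCH_DIR.glob("*.xlsx"):
            conn = pyodbc.connect(CONN_STR)
            try:
                import_file(xlsx, conn.cursor())
                conn.commit()
            finally:
                conn.close()
            shutil.move(str(xlsx), str(DONE_DIR / xlsx.name))  # avoid importing the same file twice
        time.sleep(600)  # poll every 10 minutes

if __name__ == "__main__":
    main()

A real version would also need to cope with files that are still being copied into the folder. If you prefer event-driven watching over polling, the watchdog package is a common choice, and SSIS (from your update) is a reasonable alternative if you would rather configure a package than maintain a script.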
I am working with Tableau and have to write multiple different SQL queries each time I create a new data source.
I have to save all the SQL changes for every data source.
Currently I paste the SQL into Notepad and save the files in a separate folder on my computer, along with a description of the changes.
Is there any better way to do this?
Assuming you have permission to create objects in the database, begin by creating database views, as @Nick.McDermaid commented.
Then, instead of using Custom SQL data source in Tableau, just connect to the View as if it were a table.
If you need to track the changes to these SQL views of your data, you will need to learn how to use source control for the .sql files that can be scripted from within SQL Server Management Studio:
Your company or school may have a preferred source control system already in use, in which case you should use that. If they don't, or if you are learning at home, then Git and Subversion are popular open source choices.
There are many courses available on learning platforms like Coursera that will teach you how to use those systems.
I had a similar problem.
We ended up writing the queries in the SQL Workbench editor (https://www.sql-workbench.eu/), then managed the code history and performed peer review of the code (logic, error checks, etc.) in a team shared space (like Confluence).
The reasons we did that are:
1) SQL queries are much easier to write in SQL Workbench.
2) Code review is a must! Through a review process you will find more mistakes than you could ever imagine.
3) The shared space is just really convenient, as it is accessible by everyone and all errors are documented. After some time you accumulate a lot of visible knowledge.
I also totally agree with Nick that this is one step toward a reporting solution. But developing a whole reporting server is heavy, costly and takes time. Unless management is really convinced of the importance of a reporting solution, you may have to settle for a workaround with queries and Tableau (at least that was the case for us).
A little late to the party, but I would suggest you simply version the Tableau workbook. The contents of the workbook are XML, so it is perfect for versioning using file-based tools (Dropbox, OneDrive, etc.) or source control (Git, etc.). The workbooks themselves are usually quite small, so just make sure to keep the extract data separate if you use it.
I have this homework to do, but I just don't know how to attack the problem. I need to make a program in Python that can run queries against an Excel database and an SQL database in the same program, without changing the database. My teacher says I can use JSON, but I don't know how to do it. I don't want the code, I just need the tools to solve the problem. Thank you.
If you do end up using Python, I would recommend openpyxl for reading and writing Excel sheets, and MySQL Connector/Python to connect to your database. I find this combo quick and intuitive to use.
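To give a feel for how those two libraries fit together, here is a rough sketch (not the assignment's answer). Every file name, credential, table and column in it is an invented placeholder, and it assumes the Excel sheet and the SQL table share an id column:

from openpyxl import load_workbook
import mysql.connector

# Read the Excel "database": one dict per row, keyed by the header row.
wb = load_workbook("students.xlsx", read_only=True, data_only=True)
ws = wb.active
header = [cell.value for cell in next(ws.iter_rows(min_row=1, max_row=1))]
excel_rows = [dict(zip(header, row)) for row in ws.iter_rows(min_row=2, values_only=True)]
wb.close()

# Query the SQL database without modifying it (read-only SELECT).
cnx = mysql.connector.connect(host="localhost", user="user", password="pass", database="school")
cur = cnx.cursor(dictionary=True)
cur.execute("SELECT id, name FROM students")
sql_rows = cur.fetchall()
cnx.close()

# Example query across both sources: ids present in Excel but missing from SQL.
missing = {row["id"] for row in excel_rows} - {row["id"] for row in sql_rows}
print(missing)

The same pattern works with pyodbc or another driver if the database isn't MySQL.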
I am new to Access, I am a C programmer who also worked with Oracle. Now I am creating an Access database for a small business with Access front-end and SQL Server back-end. My database has about 30 small tables (a few hundreds records each) and a rather complicated algorithm.
I don't use VBA code because I don't have time for learning VBA, but I can write complicated SQL statements, so I use a lot of queries and macros.
I am trying to minimize the daily growth of my database. I've thought about splitting the Access DB. It doesn't make sense because my DB is rather small. After compacting its size is about 5 MB. The regular compact procedure is not convenient because my client's employees work from home any time they wish. So I need to create a DB that would bloat as slowly as possible.
I did some research and found some useful info: "the most common causes of db bloat are over-use of temporary tables and over-use of non-querydef SQL" (http://www.access-programmers.co.uk/forums/showthread.php?t=48759). Could somebody please clarify that for me? I have 3 questions about it:
1) I cannot help using temporary tables. I tried re-using the same table names in two ways:
a) first clear all records and then run an append query or
b) first run a Macro command "DeleteObject" (to free the space in full) and then re-create the temporary table.
Can somebody please advise which way is better in order to reduce the DB growth?
2) After running a stored query I cannot free the space the way I would in C, because I don't use VBA and its "query.close" statement. But I can run the Macro command "Close query" after each "OpenQuery". Will that help, or just double the length of my Macros?
3) Is it correct that I shouldn't use the Macro command RunSQL for simple SQL statements and should create stored queries instead, even though it will create additional stored queries?
Any help would be appreciated!
Ah the joys of going back to lego after being a brickie! :)
1) Access is essentially a file-based system. When you delete a record or a table, it persists in the file but with a flag which marks it to be ignored. When you compact an Access db, the executable creates a new file, moves everything unmarked into it, then deletes the old file. You can watch this actually happen if you use Windows Explorer to monitor the folder during the process.
You mention you are using SQL Server; is there a reason you are not building the temp tables on the server? This would be a faster, cleaner and all-round more efficient solution - unless we've missed something. Failing that, you will have to make the move from macros, but truthfully, if you can figure out C, then VBA will be like writing a memo!
http://www.access-programmers.co.uk/forums/showthread.php?t=263390
2) Issuing close commands for saved queries in Access has no impact on the file-bloat issue; they just make the Macros look untidy.
3) Yes, always use saved queries, since this allows Access to compile the SQL in advance and optimise execution.
ps. did you know you can call SQL Server Stored Procs from within an Access saved query?
https://accessexperts.com/blog/2011/07/29/sql-server-stored-procedure-guide-for-microsoft-access-part-1/
If at all possible, you should look for ways to dispense with local Access tables entirely, since you already have SQL Server as the back-end - though I suspect you have your reasons for this.
I'm perfectly happy with Excel; I know the formulas and I find the interface very intuitive. The only problem I have now is that I have lots of formulas in several columns, which are linked to other Excel files, and I am tracking sales over time. Currently I have 1500+ rows of data and sometimes Excel struggles to recalculate everything, so I need a way to make sure that in the future, when there are 10000+ or more rows, the calculations can still run without Excel stopping/freezing. My boss says using SQL should help. However, I am unfamiliar with it, and I know that Excel and SQL can be used in similar ways.
Ultimately, I want to know if I can run the Excel code in SQL, or if I can calculate small datasets (sets that are pulled periodically) in Excel and then export them to SQL automatically instead of having to go through the import wizard each time. Also, I would need to append the small datasets to the large one. Any ideas other than just learning SQL? This needs to be accessible to many people who don't know SQL, so simply learning SQL isn't too helpful.
If you're familiar with Excel and its formulas, it won't be too arduous for you to pick up SQL. In addition, I can copy and paste output from SQL Server (I use Express) straight into an Excel sheet that has a graph auto-built.
While I don't know all your calculations, I haven't seen Excel do something that SQL couldn't, and when you consider the benefit of indexing, along with the freedom to organize your data how you want (and to save stored procedures), a switch might only be temporarily inconvenient while you pick up SQL, and after that you'll easily produce what you need.
Ultimately, I want to know if I can run the Excel code in SQL
Not always exactly identical, but you can run similarly structured code. For instance:
SELECT AVG(Sales) "AverageSales"
FROM Sales
VERSUS
=AVERAGE(A2:A2000)
or
SELECT (((DollarToday - DollarYesterday)/DollarYesterday)*100) AS "DollarDelta"
FROM USD
VERSUS
=(((A2-A1)/A1)*100)
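For the other half of the question, pushing the small, periodically pulled datasets into SQL Server automatically instead of going through the import wizard, a short Python sketch with pandas and SQLAlchemy is one option. The file name, sheet name, connection string and table name are placeholders, and it assumes pandas, openpyxl, SQLAlchemy and the named ODBC driver are installed:

import pandas as pd
from sqlalchemy import create_engine

# Read the small periodic extract from Excel.
df = pd.read_excel("weekly_sales.xlsx", sheet_name="Sheet1")

# Connection string is a placeholder; adjust server, database and driver to your setup.
engine = create_engine(
    "mssql+pyodbc://myserver/mydb?driver=ODBC+Driver+17+for+SQL+Server&trusted_connection=yes")

# Append the extract to the large running table instead of attaching sheets in Excel.
df.to_sql("Sales", engine, if_exists="append", index=False)

Scheduled with Task Scheduler, that keeps appending the small extracts to the big table without anyone going near the wizard.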
My company has a legacy micro-simulation program that simulates a population and changes to that population over a period of years.
For each year, the program produces a binary file with a record for each individual that holds their characteristics (e.g., age, marital status, income ... about 20 fields).
We currently have several utility programs that read these files and produce summary reports. Problem is that each time somebody wants a new report, a new utility program has to be written.
Changing the program so that the records are stored in a database instead of binary files is out of the question (I have asked ... several times). I have written a few programs that import the binary files into a database and then run queries on the tables I have created. The problem here is that it always takes longer to import the data and run the query than it does to run a utility program written in C++ that just reads the records one by one and accumulates the desired data. Often the binary files contain over 30 million records, and the import step alone takes forever.
So here is my question. Is there anything out there that would allow me to specify the structure of my binary file and then run SQL queries on the file? I think you can use ODBC to run queries on plain text files, but I've never seen anything like that for binary files.
If there isn't anything available, what are the steps I would need to take to build something that could run a query directly on my file? I understand this would probably be way beyond my ability, but it can't hurt to know where I would need to start.
OpenAccess is a toolkit that you can use to build ODBC or JDBC drivers for arbitrary systems. Disclaimer: I've not used it, and another division of my company sells it.
It's possible using SSIS: Loading Binary Files into SQL Server Using SSIS
This might also be of interest: Reading and Writing Files in SQL Server using T-SQL
I do not have much experience with LINQ, but couldn't you use InteropServices to parse the binary files into C# objects and then query them with LINQ?
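If a C#/LINQ route is not on the table, the same parse-then-query idea works in Python with the struct module: stream the fixed-length records and aggregate as you go, so nothing has to be imported first. This is only a sketch; the record layout, field names and file name are invented stand-ins for the real 20-field format:

import struct
from collections import defaultdict

RECORD = struct.Struct("<iiid")   # e.g. person_id, age, marital_status, income (placeholder layout)
FIELDS = ("person_id", "age", "marital_status", "income")

def records(path):
    # Yield one dict per fixed-length record, reading the file a record at a time.
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break
            yield dict(zip(FIELDS, RECORD.unpack(chunk)))

# Equivalent of: SELECT marital_status, AVG(income) FROM people GROUP BY marital_status
totals, counts = defaultdict(float), defaultdict(int)
for rec in records("population_2010.bin"):
    totals[rec["marital_status"]] += rec["income"]
    counts[rec["marital_status"]] += 1

for status in sorted(totals):
    print(status, totals[status] / counts[status])

At 30+ million records this is still a sequential scan, but it skips the import step entirely, which is where the database approach was losing its time.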