How to model data flows with a SQL backend?

My question is not about specific code. I am trying to automate a business data governance data flow using a SQL backend. I have spent a lot of time searching the internet and reaching out to people for the right direction, but I have not yet found anything promising, so I am hoping someone here can save me from a big headache.
Assume that we have a (semi-static/dynamic) flow for our business process. Different departments own portions of the data. We need to take different actions during the flow, such as data entry, data validation, data export, approvals, rejections, and notes, and also automatically define deadlines, create reports of overdue tasks and the people accountable for them, and so on.
I guess the data management part would not be extremely difficult, but writing an application (a workflow engine) to run the flow is where I struggle. Should I use triggers, or should I write code that frequently runs queries to push completed steps to the next step? How can I use SQL tables to keep track of the flow?
If anyone could give me some hints on this matter, it would be greatly appreciated.

I would suggest using SQL Server Integration Services (SSIS). You can easily manage the scripts and workflow based on some lookup selections, and you can also schedule the SSIS package to run on a timed basis to trigger and do the job.

It's a hard task to implement an application server on SQL Server, and it would also be a very vendor-dependent solution. The best way, I think, is to use SQL Server as data storage and some application server for the business logic on top of that storage.
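To make the "SQL tables to keep track of the flow" idea concrete, here is one minimal sketch (all table and column names are invented for illustration) of how each workflow instance and its steps could be stored; a scheduled job or the application server would then run the overdue query periodically:

    -- One row per running instance of a business process.
    CREATE TABLE dbo.WorkflowInstance (
        InstanceId  INT IDENTITY PRIMARY KEY,
        ProcessName VARCHAR(100) NOT NULL,
        StartedAt   DATETIME NOT NULL DEFAULT GETDATE()
    );

    -- One row per step the instance must pass through.
    CREATE TABLE dbo.WorkflowStep (
        StepId      INT IDENTITY PRIMARY KEY,
        InstanceId  INT NOT NULL REFERENCES dbo.WorkflowInstance(InstanceId),
        StepName    VARCHAR(100) NOT NULL,                    -- 'Data entry', 'Validation', 'Approval', ...
        OwnerDept   VARCHAR(50)  NOT NULL,                    -- department accountable for the step
        Status      VARCHAR(20)  NOT NULL DEFAULT 'Pending',  -- Pending / Done / Rejected
        Deadline    DATETIME NULL,
        CompletedAt DATETIME NULL
    );

    -- Report of overdue steps and the departments accountable for them.
    SELECT s.InstanceId, s.StepName, s.OwnerDept, s.Deadline
    FROM dbo.WorkflowStep AS s
    WHERE s.Status = 'Pending'
      AND s.Deadline < GETDATE();

Whether the "push completed steps to the next step" logic then lives in triggers, in a scheduled polling job, or in an application server is a separate decision; the tracking tables stay the same either way.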

Related

Join my app database with database from software

I have been hitting a wall for quite a while now. I am making an application linked to a piece of software that we are using, which will allow the user to access data from the software with my application and to update data on the software with my application.
So here is the whole idea:
So my app will be linked to the software's database (Software Patient) with the help of a foreign key (patientId on "App Patient").
And I need to be able to search for email, password, firstName, lastName, secretStuff directly from my app and be able to update data as well on both databases.
The biggest issue here is that I can't make a third table that merges all the data into one, because the data in the software's database (Software Patient) will be updated quite a lot directly from the software by other people.
The current stack is composed of:
My application: Node.js with Sequelize, GraphQL & PostgreSQL
Software that we use: SQL Server Express
Thank you in advance!
The app you are developing must get data from your commercial Software Patient (we'll call it SP) system. That presents several questions. You really really need clear answers to these questions to finish designing the data flow in your app. Some of the questions:
How will your app get data from SP? Will you issue SQL queries to SP's database? Does SP publish an Application Programmer Interface (API) for this purpose? Or a data export function you'll use in your app's workflow?
Must your app's view of SP data be up-to-the-minute? Will an hourly update be enough? Daily?
Will your app change SP data, insert new data, or delete data in the SP system? If so, see the first question.
Must you reverse-engineer SP, that is, guess how its data is structured, to make your app work? Or can you get specs / documentation from SP's developers?
If you update a reverse-engineered database, dude, be careful!
If your app will use SQL to get data from SP, it will send that SQL to SP's SQL Server Express database. Node.js has tooling for that, but both the tooling and the SQL dialect differ from what you use with PostgreSQL. Maybe it would be wise to use SQL Server throughout: doing so reduces the cognitive load on the people who will maintain and enhance your app in the future. Neither they nor you will have to keep straight the differences between the two DBMSs.
If you'll use an API, great! That's a clean interface between two systems. (It will probably have some irritating and confusing bugs, so allow some time for that. I've had to send pull requests to several API maintainers.)
If you figure out the answers to these sorts of questions, you'll make a good decision about your third-table question. It's impossible to address that specific question without answers to some of these.
And. Please. Don't forget infosec. You have a duty to keep personal data of the patients you serve away from cybercreeps.

What are ways to transfer tables from Oracle to SQL Server

I've been searching the internet for this question:
What are ways to transfer data and tables on a daily basis from an Oracle's Hyperion to SQL Server 2000?
I am an intern at a company and trying to figure out possible ways to do this. Any help or point in the right direction is greatly appreciated
This is going to depend a lot on specifics. Here are just a few possible solutions:
DTS
DTS is packaged with SQL 2000 and is made for this kind of task. If written correctly, your DTS package can have good error handling and be rerunnable/reusable.
SSIS
SSIS is actually packaged with SQL 2005 and above, but you can connect it to other databases. It's basically a better version of DTS. (Technically it's radically different from DTS, but it has a lot of the same functionality.)
Linked Servers
From SQL 2000 you should be able to connect directly to your Oracle database as a linked server. In the pros column this kind of direct access can be easy to work with if you don't have any other technical skills such as DTS or SSIS, but it can be complex to get the initial set-up right and there may be security concerns/issues.
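As an illustration of the linked-server route, the setup and a cross-server pull look roughly like the following; the linked-server name, TNS alias, and table names are placeholders, and the exact provider depends on the Oracle client installed on the SQL Server box:

    -- Register the Oracle database as a linked server (run once).
    EXEC sp_addlinkedserver
        @server     = 'ORA_SRC',
        @srvproduct = 'Oracle',
        @provider   = 'MSDAORA',        -- or 'OraOLEDB.Oracle', depending on the client
        @datasrc    = 'MyTnsAlias';

    -- Pull rows across; OPENQUERY sends the inner query to Oracle for execution.
    INSERT INTO dbo.StagingTable (Col1, Col2)
    SELECT Col1, Col2
    FROM OPENQUERY(ORA_SRC, 'SELECT col1, col2 FROM some_schema.some_table');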
Build Your Own
Depending on what other technologies you use you can build your own application to do the ETL (Extract/Transform/Load, which is what you're doing). This could be in .NET, Java, etc. In the pros column you can use something with which you're familiar but there's a big downside here in that most of the low level type of work is already out there in tools like DTS/SSIS, so why reinvent the wheel?
BCP
You can simply extract the data from Oracle as .csv files (or some other format) and then import them back in using SQL Server's Bulk Copy Process. This can be fast, but there aren't many bells and whistles to go with this. If this is a one-time thing with just a few tables though then this is probably the easiest and fastest way to do it.
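For example, once the Oracle side has dumped a delimited file, the SQL Server side can load it with BULK INSERT (a close cousin of bcp); the file path, delimiter, and table name below are assumptions:

    -- Load a comma-delimited extract produced from the Oracle data.
    BULK INSERT dbo.StagingTable
    FROM 'C:\extracts\some_table.csv'
    WITH (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR   = '\n',
        FIRSTROW        = 2        -- start at row 2 to skip a header row
    );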
Third Party Applications
There are a slew of ETL applications already written out there (Data Import, Data Slave, etc.). They will usually provide wizards and one-click solutions (maybe a few more than one click), but they are also going to cost a bit of extra money.
EDIT:
Given your latest comment, I would probably go with a DTS package that's scheduled in SQL Agent to run daily. You can add in error handling and have the system email/text/call someone if there's ever an issue (or do positive-case reporting, i.e. send a message when it's successful, so that someone knows there's a problem if they don't get a message each day).
In our company we use ADO.NET for the same task.
We created a connection to Oracle as the source, pulled all the data, and then created it in SQL Server.
You could write DTS packages to copy the data, and schedule them to run within Sql Server Agent.
See DTS Overview for information on DTS packages.
Here's a tutorial on creating a DTS package: Creating DTS Packages With SQL Server 2000
Oracle Hyperion is a suite of products, largely unrelated to Oracle's database product. I expect you are referring to a product such as Hyperion Financial Management or Hyperion Strategic Finance. These products have APIs that can be consumed using COM Interop or web services. The data can be extracted from the internal multidimensional database by analyzing the database metadata, creating dimension trees, and then using that information to create selections that represent subcubes within the database, allowing you to get or set cell data.
I don't know what your level of knowledge of multidimensional databases is, but unless it is substantial you may find the task pretty hard. You also need to get a handle on the particular product API.
My company specializes in these kinds of activities, and we have components for this kind of thing. Drop me a line on my blog if you need further advice.
danielvaughan.org
Cheers,
Daniel
I don't know anything about Hyperion, but SQL Server 2000 is very old and may not have a driver to pull data from Hyperion if that version is newer than 2000. You may need to look at whether there is a way to push the data from Hyperion rather than pull it into SQL Server 2000. One way I have done this in the past is to create a pipe-delimited text file from the database that originally has the data and place it in a processing directory. I do know that DTS will process a pipe-delimited text file. So if you can't find a driver to process this data directly, consider whether you can push it out to a file and then process it. You will have to schedule a time gap between the job on Hyperion that creates the file and the DTS package job, but if you are only doing it once a day, that's probably not a problem.

What is the fastest way for me to take a query and turn it into a refreshable graph of the results set?

I often find myself writing one-off queries to either answer someone's question or troubleshoot something, and I would like to be able to quickly expose the on-demand, refreshable results of the query graphically so that I can share these results with others without having to go through the process of creating an SSRS report and publishing it to a Reporting Services server.
I have thought about using Excel to do this, or maybe running a local SSRS server, but both of these options are still labor intensive and I cannot justify the time it would take since no one has officially requested that I turn this data into a report.
The way I see it, the business I work for has invested money in having me create these queries, which often return potentially useful data that other people in the organization might want. But since the results aren't exposed in any way, and the others may not even realize they want this data, the potential value of the query is not realized. I want to increase the company's return on investment in all these one-off queries that I and other developers write by exposing their results graphically, so that they can be browsed by others and then potentially turned into more formalized SSRS reports if they provide enough value to justify developing them.
What is the fastest way for me to take a query and turn it into a refreshable graph of the results set?
Why don't you simply use what you may already have: Excel. You can import data via an ODBC / Oracle / SQL connection. Get Data... and bam, you can run the query and format it right in the spreadsheet, and provide sorting, etc. All you need to supply is the database name and a user name and password to connect to the db.
JonH is right regarding Excel's built-in ODBC support, but I have had tons of trouble with this. In my case, the ODBC connection required the client software to be installed so that it could use the encryption methods, etc. Also, even if that were not the case, the user (I believe) would still have to manually install and set up an ODBC connection.
Now if you just want something on your machine to do the queries and refresh them, JonH's solution is great and my caveats are probably irrelevant. But if you want other users to have access, you should consider having a middle-man app (basically a PHP script, assuming a web server is an option for you) that runs a query, transforms the results into XML, and outputs it as "report-xyz.xml". You can then point anybody running a newer version of Excel to that address and they can very easily import the data into Excel with no overhead (basically a kind of web service).
Keep in mind, I don't think you should have a web script that allows users to make arbitrary queries to your database server! You would have some admin page where you pass the query in and a new XML file with the results gets made. So my idea is also based on the assumption that you want to run the same queries over and over without any specifics passed in. (If you do need specifics passed in, I'd look into finding a pre-built web services bridge for your database that already has security features built in; then you could have users make the limited changes allowed.)
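For what it's worth, the query-to-XML step doesn't even have to live in the script; SQL Server can emit the XML itself with FOR XML, something along these lines (the table, columns, and element names are invented):

    -- Return the result set as a single XML document that the middle-man
    -- script only has to write out as report-xyz.xml for Excel to consume.
    SELECT OrderId, CustomerName, OrderTotal
    FROM dbo.Orders
    WHERE OrderDate >= DATEADD(DAY, -7, GETDATE())
    FOR XML PATH('row'), ROOT('report');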

What strategies are available for migrating Access databases to SQL server-based applications?

I'm considering undertaking a project to migrate a very large MS Access application to a new system based on SQL Server. The existing system is essentially an ERP application with a couple of dozen users, all sharing the Access database over the network. The database has around 300 tables and lots of messy VBA code. This system is beginning to break down (actually, it's amazing it has worked as long as it has).
Due to the size and complexity of the Access application, a 'big bang' approach is not really feasible. It seems sensible to rope off chunks of functionality and migrate them piecemeal to the new system. During the migration process, which I expect to take several months, there may be a need for both databases to be in operation and be able to query and modify data in both systems.
I have considered using something like the ADO.NET Entity Framework to implement a data abstraction layer to handle this, but as far as I can tell, the Entity Framework has no Access provider.
Does my approach seem reasonable? What other strategies have people used to accomplish similar goals?
You may find that the main problem is using the MS Access JET engine as the backend. I'm assuming that you do have an Access FE (frontend) with all objects except tables, and a BE (backend - tables only).
You may find that migrating the data to SQL Server, and linking the Access FE to that, would help alleviate problems immediately.
Then, if you don't want to continue to use MS Access as the FE, you could consider breaking it up into 'modules', and redesign modules one by one using a separate development platform.
We faced a similar situation a few years ago, but we knew from the beginning that we would have to switch one day to SQL Server, so the whole code was written to work from an Access client against both Access AND SQL Server databases.
The idea of having a 'one-step' migration to SQL Server is certainly the easier way to manage this on the database side, and there are many tools for that. But depending on the way your client app talks to the database, your code might then not work properly. If, for example, your code includes a lot of SQL instructions (or generates them on the fly by, for example, adding filters to SELECT instructions), your syntax might not be SQL Server compatible: Access wildcards, dates, and functions will not work on SQL Server.
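A couple of typical examples of statements that run fine against Jet/Access but break against SQL Server (table and column names are purely illustrative):

    -- Access/Jet dialect: * wildcard, # date literals, Jet-only functions such as Nz()
    SELECT * FROM Customers WHERE LastName LIKE 'Sm*' AND Created > #01/01/2009#;

    -- The SQL Server equivalent: % wildcard, quoted dates, ISNULL() instead of Nz()
    SELECT * FROM Customers WHERE LastName LIKE 'Sm%' AND Created > '20090101';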
In addition to this, and as said by @mjv, the other drawback of a one-time switch to MS SQL is that you will inherit many of the problems of the original database: wrong or inappropriate field names, inappropriate primary/foreign key policies, hidden one-to-many relations that you'd like to implement in the new database model, etc.
I'll propose here some principles and rules to implement a 'soft transition' solution, which clearly best fits your case. Just to say that it's not going to be easy, but it's definitely very interesting, particularly when dealing with 300 tables! Lucky you!
I assume here that you have the ability to update the client code, and that you'd prefer to keep the same client interface at all times. It is of course possible to have two different interfaces at transition time, one for each database, but this would be very confusing for the users and a permanent source of frustration for them.
In my view, the best solution strongly depends on:

The original connection technology, and the way data is managed in your client's code: Access linked tables, ODBC, ADODB, recordsets, local tables, form recordsources, batch updating, etc.

The possibilities to split your tables and your app into 'mostly independent' modules.

And you will not spare the following mandatory activities:

Set up a transfer procedure from the Access database to SQL Server. You can use existing tools (the Access Upsizing Wizard is very poor, so do not hesitate to buy a real one, like SSW or EMS SQL Manager, both very powerful) or build your own with Visual Basic. If your plan is to make some changes to the data definitions, you'll definitely have to write some code. Keep in mind that you will run this code many, many times, so make sure it includes all the time-saving instructions that will let you restart the process from scratch as many times as you want. You will have to choose between two basic strategies when importing data (a sketch of both follows this list):

a - DELETE the existing record, then INSERT the imported record

b - UPDATE the existing record from the imported record
If you plan to switch to new primary/foreign key types, you'll have to keep track of the old identifiers in your new database model during the transition period. Do not hesitate to switch to GUID primary keys at this stage, especially if the plan is to replicate data on multiple sites one of these days.
This transfer procedure will be divided into modules corresponding to the 'logical' modules defined previously, and you should be able to run any of these modules independently (keeping in mind, of course, that they'll probably have to be run in a specific order, where the 'customers' module has to run before the 'invoicing' module).
Implement in your client's code the possibility to connect to both the original MS Access database and the new SQL Server database. Ideally, you should be able to manage both connections from within your code for displaying and validating data.
This possibility will be implemented module by module, where you will have, for each of them, a 'trial period', i.e. the possibility to choose at testing time between the Access connection and the SQL connection when using the module. Once testing is done and complete, the module can then be run in exclusive SQL Server mode.
During the transfer period, which can last a few months, you will have to manage programmatically the database constraints that exist between 'SQL Server' modules and 'Access' modules. Going back to our customers/invoicing example, the Customers module will be switched to SQL Server first. Before the Invoicing module can be switched, you'll have to implement programmatically the one-to-many relation between Customers and Invoices, where each of the tables is in a different database. Such a constraint can be implemented on the Invoice form by populating the Customers combobox with the Customers recordset from SQL Server.
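To illustrate the two data import strategies listed above (option a and option b), here is a rough sketch in T-SQL, assuming the transfer code has just filled a staging table from Access; all table and column names are invented:

    -- Strategy a: DELETE the existing records, then INSERT the imported ones.
    DELETE FROM dbo.Customers
    WHERE CustomerId IN (SELECT CustomerId FROM staging.Customers);

    INSERT INTO dbo.Customers (CustomerId, CustomerName, CountryId)
    SELECT CustomerId, CustomerName, CountryId
    FROM staging.Customers;

    -- Strategy b: UPDATE the existing records in place from the imported ones.
    UPDATE c
    SET    c.CustomerName = s.CustomerName,
           c.CountryId    = s.CountryId
    FROM   dbo.Customers AS c
    JOIN   staging.Customers AS s ON s.CustomerId = c.CustomerId;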
My proposal is to build your modules following your database model, always beginning with the 'one' tables of your 'one-to-many' relations: basic lists like 'Units', 'Currencies', and 'Countries' shall be switched first. You'll get a first hands-on experience in writing data transfer code and managing a second connection in your client interface. You'll then be able to 'go up' in your database model, switching the 'products' and 'customers' tables (where units, countries, and currencies are foreign keys) to the new server.
Good luck!
I would second the suggestion to upsize the back end to SQL Server as step 1.
I would never go to the suggested Step 2, though (i.e., replacing the Access front end with something else). I would instead suggest investing the effort in fixing the flaws of the schema, and adjusting the Access app to work with the new schema.
Obviously, it is never the case that everything just works hunky dory when you upsize -- some things that were previously quite fast will be dogs, and some things that were previously quite slow will be fast. And I've found that it is often the case that the problems are very often not where you anticipate that they will be. You can only figure out what needs to be fixed by testing.
Basically, anything that works poorly gets re-architected, or moved entirely server-side.
Leverage the investment in the existing Access app rather than tossing all that out and starting from scratch. Access is a fine front end for a SQL Server back end as long as you don't assume it's going to work just the same way as it would with a Jet/ACE back end.
...thinking out loud... I think this may work.
It appears that the complexity of the application resides in the various VBA modules rather than in the database tables/schema themselves. A possible migration path could therefore be to first migrate the data storage to SQL Server, exactly as-is, as follows:
prevent any change to the data for a few hours
duplicate all tables to the SQL server; be sure to create the same indexes as well.
create linked tables to ODBC Source pointing to the newly created tables on SQL Server
these linked tables should have the very same names as the original tables (which therefore may need to be renamed, say with a leading underscore, so they can still be referenced).
Now, the application can be restarted and should be using the SQL tables rather than the Access tables. All logic should work as previously (right...), possible slowness to be expected, depending on the distance between the two machines.
All the above could be tested in about a day's work or so; the most tedious being the creation of the tables on SQL server (much of that can be automated, I'm sure). The next most tedious task is to assert that the application effectively works as previously, but with its storage on SQL.
EDIT: As suggested by a comment, I should stress that there is a [fair?] possibility that the application would not readily work so smoothly with a SQL Server back-end, and could require weeks of hard work in testing and fixing. However, unless some of these difficulties can be anticipated because of insight into the application not expressed in the question, I propose that attempting the "as-is" migration to SQL Server should be considered; after all, it may just work with minimal effort, and if it doesn't, we'd know this very quickly. This is therefore a high-return, low-risk proposal...
The main advantage sought with this approach is that there will be a single data store during the [as the OP expects] longer period during which the old Access application will co-exist with the new application.
The drawback of this approach is that, at least at first, the schema of the original database is reproduced verbatim, i.e. including some of its known quirks and legacy-inherited idiosyncrasies. These schema issues (and the underlying application logic) can be corrected in time, but this is of course less easy than if the new application started ab initio, with its own separate storage and distinct schema.
After the storage is moved to SQL Server, the most used and/or most independent modules of the Access application can be rewritten in the new application, and as significant portions of the original application are ported, effective usage, by select beta testers or by actual users, can start to be switched to the new application.
Possibly, some kind of screen-scraping-based logic or some other system could be used to produce a hybrid application which would provide the end users with a comprehensive application that sometimes works from the new logic and sometimes from the original MS Access program.

How do I keep a table synchronized with a query in SQL Server - ETL?

I wasn't sure how to word this question, so I'll try and explain. I have a third-party database on SQL Server 2005. I have another SQL Server 2008, to which I want to "publish" some of the data in the third-party database. This database shall then be used as the back-end for a portal and reporting services - it shall be the data warehouse.
On the destination server I want to store the data in different table structures from those in the third-party db. Some tables I want to denormalize, and there are lots of columns that aren't necessary. I'll also need to add additional fields to some of the tables, which I'll need to populate based on data stored in the same rows. For example, there are varchar fields that contain info I'll want to populate other columns with. All of this should cleanse the data and make it easier to report on.
I can write the query(s) to get all the info I want into a particular destination table. However, I want to be able to keep it up to date with the source on the other server. It doesn't have to be updated immediately (although that would be good), but I'd like it to be updated perhaps every 10 minutes. There are hundreds of thousands of rows of data, but the changes to the data, additions of new rows, etc. aren't huge.
I've had a look around but I'm still not sure the best way to achieve this. As far as I can tell replication won't do what I need. I could manually write the t-sql to do the updates perhaps using the Merge statement and then schedule it as a job with sql server agent. I've also been having a look at SSIS and that looks to be geared at the ETL kind of thing.
I'm just not sure what to use to achieve this, and I was hoping to get some advice on how one should go about doing this kind of thing. Any suggestions would be greatly appreciated.
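To give a feel for the "write the T-SQL yourself" option mentioned in the question, a MERGE against the destination table, scheduled in a SQL Server Agent job, might look roughly like this; the linked-server path, table, and column names are placeholders, and the source query would be whatever cleansing/denormalizing query you already have:

    -- Upsert the denormalized destination table from a query against the source.
    MERGE dbo.OrderSummary AS target
    USING (
        SELECT o.OrderId, o.OrderDate, c.CustomerName, o.TotalDue
        FROM   SourceServer.SourceDb.dbo.OrderHeader AS o
        JOIN   SourceServer.SourceDb.dbo.Customer    AS c ON c.CustomerId = o.CustomerId
    ) AS source
    ON target.OrderId = source.OrderId
    WHEN MATCHED THEN
        UPDATE SET OrderDate    = source.OrderDate,
                   CustomerName = source.CustomerName,
                   TotalDue     = source.TotalDue
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (OrderId, OrderDate, CustomerName, TotalDue)
        VALUES (source.OrderId, source.OrderDate, source.CustomerName, source.TotalDue);

You would also need a WHEN NOT MATCHED BY SOURCE clause (or a separate DELETE) if rows can disappear from the source.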
For those tables whose schemas/relations are not changing, I would still strongly recommend replication.
For the tables whose data and/or relations are changing significantly, I would recommend that you develop a Service Broker implementation to handle them. The high-level approach with Service Broker (SB) is:
Table-->Trigger-->SB.Service >====> SB.Queue-->StoredProc(activated)-->Table(s)
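To make the trigger end of that chain a little more concrete, the SEND side might look something like the sketch below; the message type, contract, and service names, and the table, are all invented, and the queues, services, and the activated stored procedure on the receiving side have to be created separately:

    CREATE TRIGGER trg_Orders_Publish
    ON dbo.Orders
    AFTER INSERT, UPDATE
    AS
    BEGIN
        SET NOCOUNT ON;

        DECLARE @handle UNIQUEIDENTIFIER;
        DECLARE @body   XML;

        -- Package the keys of the changed rows as an XML message.
        SET @body = (SELECT OrderId FROM inserted FOR XML PATH('row'), ROOT('changes'), TYPE);
        IF @body IS NULL RETURN;

        BEGIN DIALOG CONVERSATION @handle
            FROM SERVICE [//DW/SourceService]
            TO SERVICE   '//DW/TargetService'
            ON CONTRACT  [//DW/Contract]
            WITH ENCRYPTION = OFF;

        -- The activated stored procedure on the target queue RECEIVEs this
        -- message and applies the change to the warehouse table(s).
        SEND ON CONVERSATION @handle
            MESSAGE TYPE [//DW/RowChanged] (@body);
    END;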
I would not recommend SSIS for this, unless you wanted to go to something like daily exports/imports. It's fine for that kind of thing, but IMHO far too kludgey and cumbersome for either continuous or short-period incremental data distribution.
Nick, I have gone the SSIS route myself. I have jobs that run every 15 minutes that are based in SSIS and do the exact thing you are trying to do. We have a huge relational database and then we wanted to do complicated reporting on top of it using a product called Tableau. We quickly discovered that our relational model wasn't really so hot for that so I built a cube over it with SSAS and that cube is updated and processed every 15 minutes.
Yes SSIS does give the aura of being mainly for straight ETL jobs but I have found that it can be used for simple quick jobs like this as well.
I think staging and partitioning would be too much for your case. I am implementing the same thing in SSIS now, but with a frequency of 1 hour, as I need to leave some time for support activities. I am sure that using SSIS is a good way of doing it.
During the design, I had thought of another way to achieve custom replication: by customizing the Change Data Capture (CDC) process. This way you can get near-real-time replication, but it is a tricky thing.
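If you do look into the CDC route, the moving parts are roughly as follows (this is only a sketch; dbo.Orders is an example table, and a real job would persist the last LSN it processed between runs):

    -- Enable CDC at the database level, then per table (run once).
    EXEC sys.sp_cdc_enable_db;

    EXEC sys.sp_cdc_enable_table
        @source_schema = N'dbo',
        @source_name   = N'Orders',
        @role_name     = NULL;

    -- Periodically read everything captured between two LSNs.
    DECLARE @from_lsn BINARY(10) = sys.fn_cdc_get_min_lsn('dbo_Orders');
    DECLARE @to_lsn   BINARY(10) = sys.fn_cdc_get_max_lsn();

    SELECT *
    FROM cdc.fn_cdc_get_all_changes_dbo_Orders(@from_lsn, @to_lsn, N'all');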