Salesforce integration with others NO duplicates - api

Sorry, I'm new to the Salesforce platform. I'm trying to integrate two systems through the API for one environment. We use TrackWise, a platform built on top of the Salesforce development stack. The goal is to migrate abc records from our ERP to TrackWise custom objects.
The initial load seems like it should be simple: read all records from SF, compare each key from the ERP, skip the record if it exists, and add it if not.
That's where the fun begins: a call returns at most 1,000 objects. I only have 50,000 objects in total, but I can't retrieve them all at the beginning.
The obvious logic would be to check for the key first and skip the record if it exists, except that hits limit number 2: only 200 queries per minute.
Should I just add a timer to my inserts? Is there a way to effectively make sure I do not insert a duplicate via an API call?
Well what would you do if your mother asked you?
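One way to stay under both limits described in the question is to page through the existing keys once, keep them in a set, and then decide locally what to insert. The sketch below only illustrates that idea: `fetch_page` is a hypothetical callable standing in for whatever paged read the real API offers, and `fake_fetch_page` is test data, not a Salesforce or TrackWise call.

```python
from typing import Callable, Iterable, List, Set


def collect_existing_keys(fetch_page: Callable[[int, int], List[str]],
                          page_size: int = 1000) -> Set[str]:
    """Page through the target system and gather every key already stored."""
    keys: Set[str] = set()
    offset = 0
    while True:
        page = fetch_page(offset, page_size)  # hypothetical: one API call per page
        keys.update(page)
        if len(page) < page_size:  # a short page means the end was reached
            return keys
        offset += page_size


def records_to_insert(erp_keys: Iterable[str], existing: Set[str]) -> List[str]:
    """Return ERP keys missing from the target, dropping repeats in the ERP data."""
    seen = set(existing)
    missing: List[str] = []
    for key in erp_keys:
        if key not in seen:
            seen.add(key)
            missing.append(key)
    return missing


# Stand-in for the real API (assumption): 2,500 stored keys, 1,000 per call.
_stored = [f"K{i}" for i in range(2500)]


def fake_fetch_page(offset: int, limit: int) -> List[str]:
    return _stored[offset:offset + limit]
```

With 50,000 objects and 1,000 per call, the key set would take roughly 50 reads, which fits inside a 200-queries-per-minute budget, and no per-record existence query is needed afterwards.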

Related

Hitting same API to get different number at the same time using concurrency?

I have created a web API that generates a sequence number every time it is hit. Now I need to make it safe for concurrent use, so that when multiple users hit the API at the same time, each of them gets a different number.
This depends a lot on what you are trying to do and for what reason.
Generating unique sequence numbers is very difficult in an environment where you can have multiple users hitting that endpoint at the same time.
If you are trying to give them an ID to use for some sort of data insert then I suggest you don't offer integers. Instead offer GUIDs.
The issue with inserting data based on this kind of mechanism is that sometimes no data is actually inserted, for various reasons: users change their mind, end up requesting another ID, or the subsequent call simply fails, so you end up with holes in your data.
Instead offer them back GUIDs and if the call finally comes in then use it.
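To make the two options concrete, here is a minimal sketch, assuming a single server process: a lock-protected counter that never hands out the same integer twice, and a GUID helper built on the standard `uuid` module. The names `SequenceGenerator`, `new_record_id` and `hammer` are made up for illustration.

```python
import threading
import uuid
from typing import List


class SequenceGenerator:
    """Hands out increasing integers; the lock makes read-and-increment atomic."""

    def __init__(self, start: int = 0) -> None:
        self._value = start
        self._lock = threading.Lock()

    def next(self) -> int:
        with self._lock:
            self._value += 1
            return self._value


def new_record_id() -> str:
    """GUID alternative: no shared state, so no coordination between callers."""
    return str(uuid.uuid4())


def hammer(generator: SequenceGenerator, calls_per_thread: int, threads: int) -> List[int]:
    """Call the generator from several threads at once and collect every result."""
    results: List[int] = []
    results_lock = threading.Lock()

    def worker() -> None:
        local = [generator.next() for _ in range(calls_per_thread)]
        with results_lock:
            results.extend(local)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for thread in pool:
        thread.start()
    for thread in pool:
        thread.join()
    return results
```

The counter only guarantees uniqueness inside one process. With several server instances it would have to live somewhere shared, such as a database sequence, which is part of why GUIDs are the simpler choice.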

Service that does advanced queries on a data set, and automatically returns relevant updated results every time new data is added to the set?

I'm looking for a cloud service that can do advanced statistics calculations on a large amount of votes submitted by users, in "real time".
In our app, users can submit different kind of votes like picking a favorite, rating 1-5, say yes/no etc. on various topics.
We also want to show "live" statistics to the user, showing the popularity of a person etc. This will be generated by a rather complex SQL where we are calculating the average number of times a person was picked as favorite, divided by total number of votes and the number of games in which the person has been participating etc. And the score for the latest X games should count higher than the overall score for all games. This is just an example, there are several other SQL queries with similar complexity.
All our presentable data (including calculated statistics) is served from Firestore documents, and the votes will be saved as Firestore documents.
Ideally, the Firebase-backend (functions, firestore etc) should not need to know about the query logic.
What I wish for is a pay as you go cloud service that does the following:
I define some schemas and set up the queries we need for the statistics we have (15-20 different SQLs). Like setting up views in MySQL
On every vote, we push the vote data to this service, which will store it in a row.
The service should then, based on its knowledge of the defined queries and the content of the pushed vote data, determine which statistics are affected by the newly added row, and recalculate these. A specific vote type can affect one or more statistics.
Every time a statistic is recalculated, the result should be automatically pushed back to our Firebase backend (for instance by calling an HTTPS endpoint that hits a cloud function) - so we can update the relevant Firestore documents.
The service should be able to throttle the calculations, like only regenerating new statistics every 1 minute despite having several votes per second on the same topic.
Is there any product like this in the market? Or can it be built by combining available cloud services? And what is the official term for such a product, if I should search for it myself?
I know that I can probably build a solution like this myself, and run it on a cloud hosted database server, which can scale as our need grows - but I believe that I'm not the first developer with a need of this, so I hope that someone has solved it before me :)
You can leverage the existing cloud services available on the Google Cloud Platform.
Google BigQuery, Google Cloud Firestore, Google App Engine (CRON Jobs), Google Cloud Tasks
The services can be used to solve the problems mentioned above:
1) Google BigQuery : Here you can define schema for the data on which you're going to run the SQL queries. BigQuery supports Standard and legacy SQL queries.
2) Every vote can be pushed to the defined BigQuery tables using its streaming insert service.
3) Every vote pushed can trigger the recalculation service which calculates the statistics by executing the defined SQL queries and the query results can be stored as documents in collections in Google Cloud Firestore.
4) Google Cloud Firestore: Here you can store the live statistics of the user. This is a real time database, so you'll be able to configure listeners for the modifications to the statistics and show the modifications as soon as the statistics are recalculated.
5) In the same service which inserts every vote, create a new record with a "syncId" in another table. The idea is to group the votes cast in a particular interval under their corresponding syncId. The syncId can be suffixed with a timestamp. Depending on your requirements, a time interval can be set so that the recalculation is triggered by a CRON job, which invokes the recalculation service once per interval. Once the recalculation related to a particular syncId is completed, the record corresponding to that syncId should be marked as completed.
We are leveraging the above technologies to build a web application on Google Cloud Platform, where the inputs are recorded in Google Firestore and then stream-inserted into Google BigQuery. The data stored in BigQuery is queried with SQL 30 seconds after each update, and the query results are stored in Google Cloud Firestore to serve dashboards, which are updated automatically by listeners configured on the collection that holds the dashboard information.
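The syncId grouping in step 5 can be sketched roughly as follows. This is only an illustration of the batching idea, not BigQuery or Firestore code: the `AFFECTED_STATS` mapping, the statistic names and the `sync-<window start>` format are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical mapping from vote type to the statistics it affects.
AFFECTED_STATS: Dict[str, Set[str]] = {
    "favorite": {"popularity", "favorite_share"},
    "rating": {"average_rating"},
    "yes_no": {"approval"},
}


def sync_id_for(timestamp: float, interval_seconds: int = 60) -> str:
    """Votes in the same interval share one syncId, suffixed with the window start."""
    window_start = int(timestamp // interval_seconds) * interval_seconds
    return f"sync-{window_start}"


def plan_recalculations(votes: List[Tuple[float, str, str]],
                        interval_seconds: int = 60) -> Dict[str, Set[Tuple[str, str]]]:
    """Map each syncId to the distinct (topic, statistic) pairs it must refresh.

    Each vote is a (timestamp, topic, vote_type) tuple.
    """
    plan: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    for timestamp, topic, vote_type in votes:
        sync_id = sync_id_for(timestamp, interval_seconds)
        for stat in AFFECTED_STATS.get(vote_type, set()):
            plan[sync_id].add((topic, stat))
    return dict(plan)
```

Because the plan holds a set per syncId, many votes per second on the same topic still produce a single recalculation per statistic per interval, which is the throttling the question asks for.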

When to invalidate cache - .net core api

How do I know when to invalidate the cache, if a table change is made from an outside source?
I have an api call that returns an employee table. The first time this call is made, I will cache the results so that on subsequent calls it will pull the data from the cache instead of the database. This makes sense, however, what happens if someone adds a new record to the employee table from outside of the api, how does the cache know that it is now invalid?
If the user made the change to the employee table through the API, I can capture that, but we have a separate desktop app that doesn't use the API, and that app can make changes to the employee table directly. Are there any accepted standards for handling this?
The only possible solution I can think of is to add a trigger to the employee table, and somehow use that to know when a table has changed. But, we have over a thousand tables, and we are making an api call for each table - So, I do not think that adding a thousand triggers to our database is an acceptable solution.
Yes, you could add a trigger as suggested. Or you could use a caching system that supports an expiry time or sliding expiry. You would then be serving stale data some of the time, but not always.
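As a rough illustration of the expiry idea (in Python rather than .NET, and not any particular caching library's API), the sketch below keeps an expiry time next to each entry, optionally slides it on every hit, and also allows explicit invalidation. `ExpiringCache` and the injectable `clock` parameter are assumptions made for the example.

```python
import time
from typing import Any, Callable, Dict, Tuple


class ExpiringCache:
    """Cache whose entries expire after ttl_seconds, absolute or sliding."""

    def __init__(self, ttl_seconds: float, sliding: bool = False,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._sliding = sliding
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (expires_at, value)

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now < entry[0]:
            if self._sliding:
                # Sliding expiry: every hit pushes the deadline further out.
                self._entries[key] = (now + self._ttl, entry[1])
            return entry[1]
        value = loader()  # missing or expired: go back to the database
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry early, e.g. when a change is known to have happened."""
        self._entries.pop(key, None)
```

With a 30-second TTL, a row added by the desktop app would show up through the API at most 30 seconds late, without any trigger. Note that sliding expiry can keep a frequently read entry alive indefinitely, so absolute expiry is the safer default when outside writers exist.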
As the other answer suggests, your trigger idea is OK; however, as you've stated, that would be a lot of triggers.
I assume your cache is not local to the API, since otherwise triggers would not be able to reach it. In that case, could you not access it from your desktop application? The desktop application could invalidate the cache by removing the employee record from it whenever it makes a successful change to the employee table.
It boils down to this.
You have a cache (which is essentially a read store).
You have two options to update it:
- Either it times out and fetches again (which is OK if you don't need up-to-the-minute data)
- Or it has to be told that its data is no longer valid.
There are two ways to solve this:
Push model
Pull model
Push model: use a CLR trigger to push the updates to an API. Whenever DML happens, the CLR trigger calls the API, which in turn can update the cache.
Pull model: use a database trigger on the SQL Server table to populate an intermediate audit table, and poll that table from a background task.
Hope this helps!

Asp.Net MVC And SQL server - Best way of using automated jobs

I'm designing a web-based game. In this game almost all actions will take a certain amount of time, but I'm not sure where to store and execute the actions.
For example, a character wants to go from A to B, and let's say this will take 30 secs. In my character table there is a column called Location, which stores the Id of the current place. So I must change this Id after 30 seconds.
The best solution I could come up with so far is creating SQL jobs. Since I don't have an environment to test how 100,000 SQL jobs would affect server performance, I wanted to ask: are there any other ways, or should I stick to SQL jobs?
PS: The logic is mostly the same as in other web-based games, so any direct example of how other games handle such things will be appreciated.
Using a SQL database will cause you a lot of pain later on because it is not ideal for what you are attempting: https://gamedev.stackexchange.com/questions/40215/use-a-sql-database-for-a-desktop-game
Only use SQL if you want to store a vast amount of login details; other than that, use something similar to Couchbase,
a NoSQL database:
http://www.couchbase.com/why-nosql/nosql-database
Just my 2 cents, hope I helped.
You don't need any job for this.
If we stay with the example above, every place where our character has been can carry additional information (in an extra table that connects places and characters), such as when the validity of the record starts and ends:
Player A is at Brighton from 2014-05-01T00:00:00 to NULL
but he is moving to London, which takes 30 secs
Player A is at London from 2014-06-09T10:30:30 to NULL, and the previous place record is closed (its "to" value is set) with the new record's "from" date (2014-06-09T10:30:30).
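A small sketch of that idea, assuming the history is kept as an ordered list of records per player (the `Stay` type and both function names are hypothetical). The arrival time is written once when the move starts, and the current location is simply derived from the timestamps whenever it is read, so nothing has to fire after 30 seconds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Stay:
    place: str
    valid_from: datetime
    valid_to: Optional[datetime] = None  # None means "still current"


def start_move(history: List[Stay], destination: str,
               now: datetime, travel: timedelta) -> None:
    """Close the current record at the arrival time and open the next one."""
    arrival = now + travel
    history[-1].valid_to = arrival
    history.append(Stay(destination, arrival))


def location_at(history: List[Stay], moment: datetime) -> Optional[str]:
    """Return the place whose validity interval contains the given moment."""
    for stay in history:
        if stay.valid_from <= moment and (stay.valid_to is None or moment < stay.valid_to):
            return stay.place
    return None


# The example from the answer: the move starts at 10:30:00 and takes 30 secs.
history = [Stay("Brighton", datetime(2014, 5, 1))]
start_move(history, "London", datetime(2014, 6, 9, 10, 30, 0), timedelta(seconds=30))
```

In a database this would be one UPDATE and one INSERT at the moment the player clicks "move", plus a date-range condition in the query that reads the location.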
I implemented a simple scheduling mechanism using only ASP.NET. You can find a proof of concept at http://weblogs.asp.net/ricardoperes/using-the-asp-net-cache-as-a-scheduler.

Rest philosophy for updating and getting records

In my app I'm displaying Race objects that essentially have three states: pending, inProgress and completed. I want to display all Races that are currently pending or inProgress, but not the ones that are completed. To do this, I want to create a RESTful API for getting these resources from my server, but I'm not sure what the best (i.e. most RESTful) approach would be.
The issue is that when someone opens or refreshes the app, I need to do two things:
Perform a GET on all the Races that are currently displayed in the client to update their status.
GET all of the new pending or inProgress Races that have been created since the client last updated
I've come up with a few different solutions, though I don't know which, if any, would be best:
Simply delete the old Race records on the client and always GET all new records
Perform 2 separate GET operations, the first which updates all the old records, and the second where I GET all the new pending / inProgress Races
Perform a single GET operation where I specify the created date of the last client record, and GET all records that are newer.
To me, this seems like a pretty common scenario but I haven't been able to find a specific answer to this type of problem. I'd like to see what SO thinks :)
Thanks in advance for your help!
Simply delete the old Race records on the client and always GET all new records
This is probably the easiest solution. However you shouldn't do that if you need a very smooth update on your client (for games, data visualization, etc.).
Perform 2 separate GET operations (...) / Perform a single GET operation where I specify the created date of the last client record, and GET all records that are newer.
I would definitely do it with a single operation. Rather than an update timestamp (timestamp operations are costly, and several operations could happen at the same time), I would use a sequence number. This is the way CouchDB handles "changes".
Moreover, as you will see in the documentation, this solution can later be upgraded to asynchronous notifications (if you need them).
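The sequence-number idea can be sketched as below. This is not CouchDB's API, only the general pattern under assumed names (`RaceStore`, `changes_since`): every write bumps one global counter, and a client asks for everything changed after the last number it saw.

```python
from typing import Dict, List, Tuple


class RaceStore:
    """In-memory stand-in for the server side of a 'changes since N' endpoint."""

    def __init__(self) -> None:
        self._seq = 0
        self._races: Dict[str, Tuple[int, str]] = {}  # race id -> (last seq, status)

    def upsert(self, race_id: str, status: str) -> int:
        """Create or update a race; each write gets the next sequence number."""
        self._seq += 1
        self._races[race_id] = (self._seq, status)
        return self._seq

    def changes_since(self, since: int) -> Tuple[int, List[Tuple[str, str]]]:
        """Return the latest sequence number and the races changed after `since`."""
        changed = sorted(
            (seq, race_id, status)
            for race_id, (seq, status) in self._races.items()
            if seq > since
        )
        return self._seq, [(race_id, status) for _, race_id, status in changed]
```

A single GET carrying the client's last sequence number then covers both needs from the question: status updates for races already shown (including ones that became completed, which the client can drop) and races created since the last refresh.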