I have been putting off developing this part of my app for some time, purely because I want to do it in a circular way but get the feeling it's a bad idea, from what I remember my lecturers telling me back in school.
I have a design for an order system. Ignoring everything that doesn't pertain to this example, I'm left with:
CreditCard
Customer
Order
I want it so that,
Customers can have credit cards (0-n)
Customers have orders (1-n)
Orders have one customer (1-1)
Orders have one credit card (1-1)
Credit cards can have one customer (1-1) (unique ids, so we can ignore uniqueness of the cc number; husband/wife may share cc instances, etc.)
Basically, the last part is where the issue shows up. Sometimes a credit card is declined and the customer wishes to use a different one. This needs to update which card is their 'current' one, but it may only change the card used for that order, not for the other orders the customer may have on disk.
Effectively this creates a circular design between the three tables.
Possible solutions:
Either
Create the circular design, give references:
cc ref to order,
customer ref to cc
customer ref to order
or
customer ref to cc
customer ref to order
create new table that references all three table ids and put unique on the order so that only one cc may be current to that order at any time
Essentially both model the same design but translate differently. I like the latter option best at this point because it seems less circular and more central (if that even makes sense).
My questions are,
What, if any, are the pros and cons of each?
What are the pitfalls of circular relationships/dependencies?
Is this a valid exception to the rule?
Is there any reason I should pick the former over the latter?
Thanks and let me know if there is anything you need clarified/explained.
--Update/Edit--
I have noticed an error in the requirements I stated. Basically, I dropped the ball when trying to simplify things for SO. There is another table, Payments, which adds another layer. The catch: Orders can have multiple payments, possibly using different credit cards (and, if you really want to know, even other forms of payment).
Stating this here because I think the underlying issue is still the same and this only really adds another layer of complexity.
A customer can have 0 or more credit cards associated, but the association is dynamic - it can come and go. And as you point out a credit card can be associated with more than one customer. So this ends up being an n:m table, maybe with a flag column for "active".
An order has a static relationship to 0 or 1 credit card, and after a sale is complete, you can't mess with the cc value, no matter what happens to the relationship between the cc and the customer. The order table should independently store all the associated info about the cc at the time of the sale. There's no reason to associate the sale with any other credit card column in any other table (which might change - but it wouldn't affect the sale).
I think the problem is with the modeling of the Order. Instead of one Order having one credit card, an order should be able to be associated with more than one credit card, of which only one is active at any time. Essentially, Order and CreditCard are many-to-many. To model this in the database, you need to introduce an association table, let's say PaymentHistory. Now when an order requires a new credit card, you can simply create a new credit card, associate it with the order, and mark the associated PaymentHistory row as active.
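A minimal sketch of that association table, using Python's built-in sqlite3 module with an in-memory database. The table and column names (payment_history, is_active) and the helper switch_card are my own assumptions for illustration, not part of the original design. The partial unique index is one possible way to enforce "only one active card per order".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE credit_card (id INTEGER PRIMARY KEY, label TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id)
);
-- Association table: every card that has been tried for an order.
CREATE TABLE payment_history (
    order_id INTEGER NOT NULL REFERENCES orders(id),
    card_id INTEGER NOT NULL REFERENCES credit_card(id),
    is_active INTEGER NOT NULL DEFAULT 0 CHECK (is_active IN (0, 1)),
    PRIMARY KEY (order_id, card_id)
);
-- At most one active card per order.
CREATE UNIQUE INDEX one_active_card_per_order
    ON payment_history(order_id) WHERE is_active = 1;
""")

def switch_card(conn, order_id, card_id):
    """Make card_id the active card for order_id and retire the previous one."""
    with conn:  # single transaction: commit on success, roll back on error
        conn.execute(
            "UPDATE payment_history SET is_active = 0 WHERE order_id = ?",
            (order_id,),
        )
        conn.execute(
            "INSERT OR REPLACE INTO payment_history (order_id, card_id, is_active) "
            "VALUES (?, ?, 1)",
            (order_id, card_id),
        )

# Hypothetical sample data.
conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.executemany("INSERT INTO credit_card VALUES (?, ?)", [(10, "card A"), (11, "card B")])
conn.execute("INSERT INTO orders VALUES (100, 1)")

switch_card(conn, 100, 10)  # first card is tried
switch_card(conn, 100, 11)  # it is declined, so the customer picks another

active = conn.execute(
    "SELECT card_id FROM payment_history WHERE order_id = 100 AND is_active = 1"
).fetchall()
history = conn.execute(
    "SELECT card_id, is_active FROM payment_history WHERE order_id = 100 ORDER BY card_id"
).fetchall()
```

Other orders are untouched by switch_card because every row in payment_history is scoped to a single order.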
Hmm?
A customer has several credit cards, but only one current card. An order has a single assigned card. When a customer purchases something, his default card is tried first; otherwise, he may change his main card?
I see no circular references here; when a user's credit card changes, his orders stay the same. Your tables would end up as:
Customer: id, Current Card
Credit Cards: id, number, customer_id
Order: id, Card_id, Customer_id
Edit: Oops, forgot a field, thanks.
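As a rough sketch of those three tables, here they are in Python's sqlite3 with an in-memory database. The snake_case names, the nullable current_card_id and the sample rows are assumptions on my part. The point it tries to show is that changing the customer's current card leaves an existing order's card alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer (
    id INTEGER PRIMARY KEY,
    -- Nullable, so a customer row can be inserted before any card exists.
    current_card_id INTEGER REFERENCES credit_card(id)
);
CREATE TABLE credit_card (
    id INTEGER PRIMARY KEY,
    number TEXT NOT NULL,
    customer_id INTEGER NOT NULL REFERENCES customer(id)
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    card_id INTEGER NOT NULL REFERENCES credit_card(id),
    customer_id INTEGER NOT NULL REFERENCES customer(id)
);
""")

# Hypothetical sample data: one customer with two cards.
conn.execute("INSERT INTO customer (id, current_card_id) VALUES (1, NULL)")
conn.executemany(
    "INSERT INTO credit_card (id, number, customer_id) VALUES (?, ?, ?)",
    [(10, "card-A", 1), (11, "card-B", 1)],
)
conn.execute("UPDATE customer SET current_card_id = 10 WHERE id = 1")

# The order records whichever card is current at the time it is placed.
conn.execute(
    "INSERT INTO orders (id, card_id, customer_id) "
    "SELECT 100, current_card_id, id FROM customer WHERE id = 1"
)

# Later the customer switches to another card.
conn.execute("UPDATE customer SET current_card_id = 11 WHERE id = 1")

order_card = conn.execute("SELECT card_id FROM orders WHERE id = 100").fetchone()[0]
current_card = conn.execute("SELECT current_card_id FROM customer WHERE id = 1").fetchone()[0]
```

Note that customer and credit_card do reference each other here; the nullable column is what gives the rows a workable insert order.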
No matter the reason your data has circular relationships, you'll be a lot happier if you "forget" to declare one of them so that your tables have a bulk-load order.
That comes in handy when you least expect it.
This is a year old but there are some points worth making.
NB For on-line NON-ACCOUNT processes: The Customer would be better defined as Buyer and there would also probably be another type of customer - the Beneficiary/Recipient. You can buy/purchase airline tickets and flowers etc. for other people so these two roles need to be clearly separated as they involve different business processes (one to pay and the other to be sent the goods).
If it is a non-account process then you shouldn't be retaining credit card details. It's a security risk - and you're putting the buyer at risk by keeping this information. Credit cards are processed in real-time and then the information should be thrown away.
ACCOUNT CUSTOMERS: The only exception would be when someone has opened an account and provided their credit card information for use in subsequent purchases. In such a case changes to the credit card information would take place outside of the transaction - as part of the Account Management process.
The main point is to make sure that you fully understand the business processes before you start modelling and coding.
Related
I am making an online market for a learning project. Pardon me for not having a diagram.
I have the tables Seller and Product,which contains data about the seller and products, respectively. A Seller can have multiple products. There is also a Receipt table that stores information regarding purchases made by a customer. This is an important record and must persist. The receipt should be able to have information on the item purchased.
However, products are dynamic, products may be added and removed. But since the Receipt should reference the Product, it means that I should not delete a Product row even if it is no longer on sale.
Is this the right way to do it? Is there a better design pattern I can use?
Yes, that is the right way to do it. If you set the referential integrity right, the system will not allow you to delete a product or seller if it has receipts. The next thing to do is to use a flag to mark the product or seller as deleted or archived. It could be either a boolean or a date that indicates when it became inactive. Using a 'From' AND a 'To' date to indicate valid time intervals, as Hellmar Becker suggests, is very powerful, but it opens a whole new can of worms: you can have more than one 'valid' period, so you have to extend your primary key.
Modern databases like HANA (from SAP) just don't allow deletes any more, and have inbuilt 'deleted' flags.
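To make the "referential integrity plus archive flag" idea concrete, here is a small sketch in Python's sqlite3. The column archived_at, the helper archive_product and the sample rows are hypothetical names I picked for illustration. It uses the date variant of the flag rather than a boolean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE seller (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product (
    id INTEGER PRIMARY KEY,
    seller_id INTEGER NOT NULL REFERENCES seller(id),
    name TEXT NOT NULL,
    archived_at TEXT  -- NULL while the product is still on sale
);
CREATE TABLE receipt (
    id INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(id),
    amount_cents INTEGER NOT NULL
);
INSERT INTO seller VALUES (1, 'Acme');
INSERT INTO product VALUES (1, 1, 'Widget', NULL), (2, 1, 'Gadget', NULL);
INSERT INTO receipt VALUES (1, 1, 499);
""")

def archive_product(conn, product_id):
    """Mark a product as no longer on sale instead of deleting its row."""
    with conn:
        conn.execute(
            "UPDATE product SET archived_at = datetime('now') WHERE id = ?",
            (product_id,),
        )

# The foreign key stops a hard delete of a product that has receipts.
try:
    conn.execute("DELETE FROM product WHERE id = 1")
    delete_blocked = False
except sqlite3.IntegrityError:
    delete_blocked = True

archive_product(conn, 1)

on_sale = [name for (name,) in conn.execute(
    "SELECT name FROM product WHERE archived_at IS NULL ORDER BY id"
)]
receipt_product = conn.execute(
    "SELECT p.name FROM receipt AS r JOIN product AS p ON p.id = r.product_id "
    "WHERE r.id = 1"
).fetchone()[0]
```

The receipt can still be joined to its product after archiving, while listings that filter on archived_at IS NULL no longer show it.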
This isn't a proper answer. I just want to give you the gift of a diagram since you didn't have one! :)
(Disclaimer: QuickDatabaseDiagrams is my project)
I tried looking at similar Stack Overflow posts and it seems as though questions asking for input about a schema are valid. Also, I'm a software developer and not a DB expert by trade. So hopefully this is met well.
I'm using SQL Server, though I think this question is generic enough that it might be applicable to pretty much any SQL product as it pertains to what's the best schema for my scenario.
I'm writing a referral payment system whereby stores may credit and pay back individuals who refer customers. The entities are -
Referrer: the one to be paid for referring customers,
Referral: the customer that was referred
Referral Purchase: The amount and date of the referral's purchase.
Admin: the one doing the paying.
When determining what to pay the referrer I need to tally up all of the referral purchases that have not been credited. The sum at the time of the pay out attempt is what gets paid.
The confounding part of this whole thing is that when an Admin makes a payment, it may fail for any number of reasons (insufficient funds, the referrer gave bad PayPal information, etc.). All of this needs to be stored so that I can not only look back over past payment attempts and determine the failures and what referral purchases were involved in the failure, but also to determine which referral purchases have yet to be credited to the referrer.
The best schema I've been able to devise is the following:
The point here is that each PaymentAttempt holds the status of the payment attempt (success/failure) and each Referral Purchase that was credited in the payment attempt has a link table which associates it with the payment attempt. One referral purchase may, then, be involved in any number of attempts to credit the referrer, with the last one being the successful attempt.
Ultimately my question comes down to this: when I need to go back and then determine how much the referrer needs to be paid at a later date, is it going to be a pain in years to come if I need to query ALL of the ReferralPurchases associated with the referrer, then join ALL of the ReferralPurchase/PaymentAttempt link tables, then join the associated PaymentAttempt status tables to find out which of the referral purchases have yet to be credited? I could see myself needing to create pretty weird queries just to find those five purchases that have yet to be credited.
Alternatively I could update the ReferralPurchase itself with a status flag, but is this considered "asking for it" in terms of data integrity (I think I could see some saying this is poor design since the state could be queried in other ways, and perhaps a bug might result in the bit being set without proper records to warrant it)? Is that bad design?
Or is there some better way to lay things out?
I will try my best to help you out; hopefully I understand your question correctly. If I were designing the system, there would be two tables that stand out for me. The tables and their columns are:
ReferralPurchase
• ReferralPurchase_Id (PK)
• Referrer_Id (Pointing to a person table)
• Referral_Id (Pointing to a person table)
Payment
• Payment_Id (PK)
• ReferralPurchase_Id (FK)
• AmountToBePaid
• StatusOfPayment
• DateLogged
• DatePaymentMade (Null if status is not successful)
• Admin_Id (Pointing to a person table)
Ben, not sure what you mean by status field. I would steer away from lifecycle status fields, but would consider a boolean. For example:
An isPaid flag on ReferralPurchase would seem like a reasonable approach. It should only be updated on a confirmed payment, and if there is a query on why it has been set, the evidence will exist in the form of history from the PaymentAttempt and link tables. This would simplify queries of outstanding payments, and pending payments would just be incomplete PaymentAttempts. There is the theoretical possibility that the history could contradict the value of the flag.
Alternatively, you could have an isSuccessful flag on the link table, which is "closer to source", if I can put it that way, in that it cannot as easily be in conflict, as it is the history itself (as long as the coder does not allow more than one row to be marked isSuccessful for a given ReferralPayment for example). Finding outstanding payments is just those ReferralPayments where not exists an isSuccessful link record.
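Here is a rough sketch of the second option (the flag on the link table) in Python's sqlite3. All table and column names, the partial unique index and the sample rows are my assumptions, not the asker's actual schema. The index is one way to stop more than one link row per purchase being marked successful.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE referral_purchase (
    id INTEGER PRIMARY KEY,
    referrer_id INTEGER NOT NULL,
    amount_cents INTEGER NOT NULL
);
CREATE TABLE payment_attempt (
    id INTEGER PRIMARY KEY,
    attempted_at TEXT NOT NULL
);
CREATE TABLE attempt_purchase (
    attempt_id INTEGER NOT NULL REFERENCES payment_attempt(id),
    purchase_id INTEGER NOT NULL REFERENCES referral_purchase(id),
    is_successful INTEGER NOT NULL DEFAULT 0 CHECK (is_successful IN (0, 1)),
    PRIMARY KEY (attempt_id, purchase_id)
);
-- A purchase can be credited successfully at most once.
CREATE UNIQUE INDEX one_success_per_purchase
    ON attempt_purchase(purchase_id) WHERE is_successful = 1;

-- Hypothetical data: attempt 1 failed for purchases 1 and 2,
-- attempt 2 succeeded for purchase 1, purchase 3 was never attempted.
INSERT INTO referral_purchase VALUES (1, 7, 1000), (2, 7, 2500), (3, 7, 400);
INSERT INTO payment_attempt VALUES (1, '2024-01-05'), (2, '2024-02-05');
INSERT INTO attempt_purchase VALUES (1, 1, 0), (1, 2, 0), (2, 1, 1);
""")

def outstanding(conn, referrer_id):
    """Purchases of this referrer that have no successful link row."""
    return conn.execute(
        """
        SELECT p.id, p.amount_cents
        FROM referral_purchase AS p
        WHERE p.referrer_id = ?
          AND NOT EXISTS (
              SELECT 1 FROM attempt_purchase AS ap
              WHERE ap.purchase_id = p.id AND ap.is_successful = 1
          )
        ORDER BY p.id
        """,
        (referrer_id,),
    ).fetchall()

unpaid = outstanding(conn, 7)
amount_due = sum(amount for _, amount in unpaid)
```

The query stays a single NOT EXISTS regardless of how many failed attempts pile up, and the failure history remains available in attempt_purchase.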
Others will have different views on this. Let us know which way you go.
OK... I am hoping this is a classic problem that everyone knows the answer to already. I have been building a MySQL database (my first one) whose main purpose is to load line-item data from an invoice and related data from the matching remittance, and reconcile the two. Basically, everything was going along fine until I discovered a problem.
Details: I have thus far identified individual invoice line items with a client (to be billed) id, service date, and service type and matching that transaction against the remittance transaction with the same client ID, service date and service type. Unfortunately, there are times (I just discovered) when one client (ID) gets multiple instances of a particular service on the same day and thus my invoices are not unique based on the three components I just mentioned.
There is another piece of info on the invoice (service time) that could be used to make invoice items unique, but the remittance does not include service times (thus I cannot match directly against it using service time). Likewise, the remittance has another piece of info (claims ref number) that uniquely identifies remittance items. But of course, the claims ref number is not on the invoice.
Is there some way to use an intermediate table perhaps that can bridge this gap? Any help, answers or helpful links would be most appreciated. Thanks in advance.
This is perhaps more a business problem than a technical one: it sounds like there is in fact no reliable way to match up remittances and invoices, unless something like matching on the dollar amount works. If you use an artificial key on the invoice, you kind of solve the technical problem but not the business one.
If you can't change the business process at all and there is no technical way to match remittances and invoices, you might be forced to treat all invoices for a customer/service date/service type as a unit; make each invoice a part of that unit, and then group all the remittances and all the invoices that match that unit together.
You can make life easy on yourself and create an Invoice ID and remove the composite key altogether.
Any type of fix is going to have an impact on the calling code, as increasing the field count on the composite key implies that this new field needs to be supplied, so I suggest just creating an invoice ID.
Many IT professionals who work with RDBMSs will suggest never using natural keys. Always use a surrogate key (like an auto-increment column).
I agree with #antlersoft (+1), this sounds mostly like a business problem: how to “match up” items within two separate sets of data that cannot be clearly and cleanly matched up with the data provided.
If the “powers that be” (aka your manager/supervisor/project owner) cannot or will not make this decision, and if you have to do something, based on the information provided I’d recommend matching same-day items like so:
lowest invoice-item service time with lowest remittance claims ref number
next-lowest invoice item service time with next-lowest remittance claims ref number
etc.
(So when you have such multiple-per-day items, do you always have the same number of invoice items and remittances? Or is that going to be your next hurdle?)
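The rank-based pairing above could be sketched in plain Python like this. The field names (client_id, service_date, service_type, service_time, claims_ref) and the sample rows are hypothetical, and the sketch assumes claims ref numbers sort correctly as strings. Unmatched items are kept and paired with None, so a mismatch in counts stays visible.

```python
from collections import defaultdict
from itertools import zip_longest

def match_items(invoice_items, remittances):
    """Pair same-day items by rank: earliest service time with lowest claims ref."""
    def group(rows, sort_field):
        # Bucket rows by the shared three-part key, then order each bucket.
        groups = defaultdict(list)
        for row in rows:
            key = (row["client_id"], row["service_date"], row["service_type"])
            groups[key].append(row)
        for bucket in groups.values():
            bucket.sort(key=lambda r: r[sort_field])
        return groups

    inv = group(invoice_items, "service_time")
    rem = group(remittances, "claims_ref")
    pairs = []
    for key in sorted(set(inv) | set(rem)):
        # zip_longest pads the shorter side with None instead of dropping items.
        pairs.extend(zip_longest(inv.get(key, []), rem.get(key, [])))
    return pairs

# Hypothetical sample data.
invoices = [
    {"client_id": "C1", "service_date": "2024-03-01", "service_type": "PT", "service_time": "14:00"},
    {"client_id": "C1", "service_date": "2024-03-01", "service_type": "PT", "service_time": "09:00"},
    {"client_id": "C2", "service_date": "2024-03-01", "service_type": "PT", "service_time": "10:00"},
]
remits = [
    {"client_id": "C1", "service_date": "2024-03-01", "service_type": "PT", "claims_ref": "R-200"},
    {"client_id": "C1", "service_date": "2024-03-01", "service_type": "PT", "claims_ref": "R-100"},
]

pairs = match_items(invoices, remits)
summary = [
    (inv["service_time"] if inv else None, rem["claims_ref"] if rem else None)
    for inv, rem in pairs
]
```

Here the C2 invoice item has no remittance, so it comes back paired with None rather than silently disappearing.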
Once you know how to implement “matching up” items, you then have to implement it by storing the data that supports/defines the association within the database. Assuming tables InvoiceItem and Remittance, you could add (and populate) ServiceTime in the Remittance table, or ClaimsRefNumber in the InvoiceItem table (the latter seems more sensible to me). Alternatively, as most people suggest, you could add a surrogate key to either (or both) tables, and store one’s surrogate key in the other’s table. (Again, I’d store, say, RemittanceId in table InvoiceItem, as presumably you couldn’t have a Remittance without an InvoiceItem – but it depends strongly upon your business logic.)
Pawnshop business model:
CLIENTES (customer table), LOTES (lot table), ARTICULOS (item table) and TRANSACCIONES (transaction table).
The reason I defined a lot table is that when customers pawn or sell items, the pawnshop groups all these items into one lot, calculates the total loan or purchase amount, stores these values under one transaction and prints the ticket with a description of all the items and the total amount. So I want the ability to say: if the customer defaults on interest payments or does not redeem the pawn, then the customer forfeits all items, and the pawnshop may choose to sell some items to a gold refinery and/or transfer other non-gold items to inventory to sell to the public. In other words, I want the ability to do a break-out explosion of each item.
Would the above ER be adequate for this capability?
From the point of view of a logical model, you probably don't want store_id on the lot (as it comes from the customer) or on the transactions or articles (as they get it through the lot and customer). At the physical level you might keep those as attributes (called denormalisation), but then you run the risk of the data showing, for example, LOT 1234 belonging to CUSTOMER C12 and being at STORE S1, while the customer table has C12 at store S2.
Of course it is possible that you allow Mr Smith to pawn an item at one store but make payments on it at another. Or perhaps an item might be pawned at one store but physically relocated to a different one for security or space reasons. If so, then it is appropriate to have distinct store ids on these entities.
However that doesn't sit comfortably with the 'store' being an attribute of the customer, since that implies they have a relationship with only one store.
Also consider what happens if MR P BROKER has three stores, but decides to close one and move the business to one of the others. You need to merge the stores but do you update the store id on all the transactions and articles and lots (including ones that are 'in progress' and those redeemed) or do you leave them with the original store id ?
Another common data modelling issue is identifying customers. Is Mr Smith one customer and Mrs Smith another, or can Mr and Mrs Smith be 'parts' of the same customer ? If Mr Smith pawns something, can Mrs Smith redeem it ? I'm thinking family squabbles, disputed heirlooms.... Perhaps she can't redeem it, but can make payments on it.
If an item (eg a watch) is included in one lot, then redeemed, then included in a different lot, does it get a different item_id ?
When a client buys an article offered to the general public, is that a transaction? Or does your database only track transactions about lots?
Can an item exist in your system without being part of any lot? You can't express that fact in the ER model you've presented.
Your ER model doesn't show any many to many relationships. That makes me suspicious. I've never worked in a pawnshop, so I can't say for sure. But every other enterprise database I've ever seen has at least one many-to-many relationship. Sometimes a relationship is treated as though it were an entity, and appears with a box of its own. But that box would be on the "infinity" end of more than one relationship, something I don't see in your diagram.
Buena suerte.
I'm curious on how some other people have handled this.
Imagine a system that simply has Products, Purchase Orders, and Purchase Order Lines. Purchase Orders are the parent in a parent-child relationship to Purchase Order Lines. Purchase Order Lines, reference a single Product.
This works happily until you delete a Product that is referenced by a Purchase Order Line. Suddenly, the Line knows it's selling 30 of something... but it doesn't know what.
What's a good way to anticipate the deletion of a referenced piece of data like this? I suppose you could just disallow a product to be deleted if any Purchase Order Lines reference it, but that sounds... clunky. I imagine it's likely that you would keep the Purchase Order in the database for years, essentially welding the product you want to delete into your database.
The parent entity should NEVER be deleted or the dependent rows cease to make sense, unless you delete them too. While it is "clunky" to display old records to users as valid selections, it is not clunky to have your database continue to make sense.
To address the clunkiness in the UI, some people create an Inactive column that is set to True when an item is no longer active, so that it can be kept out of dropdown lists in the user interface.
If the value is used in a display field (e.g. a readonly field) the inactive value can be styled in a different way (e.g. strike-through) to reflect its no-longer-active status.
I have StartDate and ExpiryDate columns in all entity tables where the entity can become inactive or where the entity will become active at some point in the future (e.g. a promotional discount).
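A small sketch of how such a date window could be queried, using Python's sqlite3. The product table, its sample rows and the choice to treat the expiry date as exclusive are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    start_date TEXT NOT NULL,  -- ISO dates compare correctly as text
    expiry_date TEXT           -- NULL means no planned end
);
INSERT INTO product VALUES
    (1, 'Retired widget', '2020-01-01', '2022-01-01'),
    (2, 'Current widget', '2022-01-01', NULL),
    (3, 'Future promo',   '2030-01-01', '2030-02-01');
""")

def active_products(conn, on_date):
    """Names of products selectable on the given ISO date (expiry is exclusive)."""
    rows = conn.execute(
        "SELECT name FROM product "
        "WHERE start_date <= ? AND (expiry_date IS NULL OR expiry_date > ?) "
        "ORDER BY id",
        (on_date, on_date),
    )
    return [name for (name,) in rows]

today_list = active_products(conn, "2024-06-01")
past_list = active_products(conn, "2021-06-01")
promo_list = active_products(conn, "2030-01-15")
```

A dropdown would be filled from active_products for today's date, while old records keep pointing at rows that still exist.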
Enforce referential integrity. This basically means creating foreign keys between the tables and making sure that nothing "disappears".
You can also use this to cause referenced items to be deleted when the parent is deleted (cascading deletes).
For example, you can create a SQL Server table in such a way that if a PurchaseOrder is deleted, its child PurchaseOrderLines are also deleted.
Here is a good article that goes into that.
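For a quick feel of both behaviours, here is a sketch that uses SQLite through Python's sqlite3 module rather than SQL Server (a substitution on my part so the example is self-contained). The snake_case table names and sample rows are assumptions. The product reference uses the default, restrictive behaviour, and the order reference uses ON DELETE CASCADE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE purchase_order (id INTEGER PRIMARY KEY);
CREATE TABLE purchase_order_line (
    id INTEGER PRIMARY KEY,
    purchase_order_id INTEGER NOT NULL
        REFERENCES purchase_order(id) ON DELETE CASCADE,
    product_id INTEGER NOT NULL REFERENCES product(id),
    quantity INTEGER NOT NULL
);
INSERT INTO product VALUES (1, 'Widget');
INSERT INTO purchase_order VALUES (1), (2);
INSERT INTO purchase_order_line VALUES (1, 1, 1, 30), (2, 2, 1, 5);
""")

# Deleting a referenced product is rejected, so nothing "disappears".
try:
    conn.execute("DELETE FROM product WHERE id = 1")
    product_delete_blocked = False
except sqlite3.IntegrityError:
    product_delete_blocked = True

# Deleting an order takes its lines with it.
conn.execute("DELETE FROM purchase_order WHERE id = 1")
remaining_lines = conn.execute(
    "SELECT id FROM purchase_order_line ORDER BY id"
).fetchall()
```

Only the line belonging to the deleted order goes away; the line on order 2 and the product it references are untouched.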
It doesn't seem clunky to keep this data (to me at least). If you remove it then your purchase order no longer has the meaning that it did when you created it, which is a bad thing. If you are worried about having old data in there you can always create an archive or warehouse database that contains stuff over a year old or something...
For data like this where parts of it have to be kept for an unknown amount of time while other parts will not, you need to take a different approach.
Your Purchase Order Lines (POL) table needs to have all of the columns that the product table has. When a line item is added to the purchase order, copy all of product data into the POL. This includes the name, price, etc. If the product has options, then you'll have to create a corresponding PurchaseOrderLineOptions table.
This is the only real way of ensuring that you can recreate the purchase order on demand at any point. It also means that someone can change the pricing, name, description, and other information about the product at any time without impacting previous orders.
Yes, you end up with a LOT of duplicate information in your line item table, but that's okay.
For kicks, you might keep the product id in the POL table for referencing back, but you cannot depend on the product table to have any bearing on the paid for product...
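A minimal sketch of that copy-on-insert approach in Python's sqlite3. The column names, the add_line helper and the sample product are hypothetical. product_id is deliberately left without a foreign key here, so the product can be changed or even deleted without touching the line.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    price_cents INTEGER NOT NULL
);
CREATE TABLE purchase_order_line (
    id INTEGER PRIMARY KEY,
    purchase_order_id INTEGER NOT NULL,
    product_id INTEGER,                -- kept only for referencing back
    product_name TEXT NOT NULL,        -- copied when the line is created
    unit_price_cents INTEGER NOT NULL, -- copied when the line is created
    quantity INTEGER NOT NULL
);
INSERT INTO product VALUES (1, 'Widget', 499);
""")

def add_line(conn, order_id, product_id, quantity):
    """Copy the product's current name and price into a new order line."""
    with conn:
        conn.execute(
            "INSERT INTO purchase_order_line "
            "(purchase_order_id, product_id, product_name, unit_price_cents, quantity) "
            "SELECT ?, id, name, price_cents, ? FROM product WHERE id = ?",
            (order_id, quantity, product_id),
        )

add_line(conn, 1, 1, 30)

# Later changes to the product, or even its removal, do not affect the line.
conn.execute("UPDATE product SET name = 'Widget v2', price_cents = 599 WHERE id = 1")
conn.execute("DELETE FROM product WHERE id = 1")

line = conn.execute(
    "SELECT product_name, unit_price_cents, quantity FROM purchase_order_line"
).fetchone()
```

The line still reads as 30 units of 'Widget' at the original price, which is what lets the purchase order be recreated later.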