Looking for Reasons NOT to put SharePoint 2010 in the Cloud

I am looking for a laundry list of reasons why a large company with 24,000 employees would NOT want to put their primary SharePoint system for their intranet into the cloud.
What are the limitations and challenges compared to operating your own farm servers?
Thank you for your thoughts.

Depending on the provider, they might limit your choice of webparts, addons, solutions, etc.
Check for:
Addons that contain unmanaged code (BPOS does not allow this, for example)
Addons that need to elevate privileges
Anything that needs to run in a full-trust environment
Ask about any other possible limitations. With 24k users, you are probably only looking at high-end providers, but ask, just in case.

You mean apart from the fact that hosting your entire sensitive company data, trade secrets, and potentially HR data with a third party that may or may not do a good job securing it from other customers on the same cloud might be just a tiny little risk?
Or that an outage at the provider (like the Amazon S3 blackout yesterday) leaves you somewhat powerless and at the mercy of the provider?

Related

Cloud scale user management

I am building a service to handle a large number of devices, for a large number of users.
We have a complex schema of access roles assigned to each entity. Some data entries can be written to by certain users, while some users can only read from some entities (but can write to others).
This is a cloud service: there are more devices and users than can be handled by a single server machine (we are using non-relational cloud databases for this).
I was wondering if there is an established cloud-scale user/role management backend system which I could integrate to enforce the access rules, instead of writing my own. This tech should preferably be cloud-agnostic, so I would prefer not to use a SaaS solution but to deploy my own.
I am looking for a system which can scale to millions of users and billions of data entities.
I think authentication is not going to be a big issue; there are very robust cloud-based solutions available for storing identities and authenticating millions of users. Authorization will be trickier and will depend a lot on how granular you want it to be. You could look at Apigee, for example, as a very scalable proxy that might help you implement this. So getting to the point where you have a token that verifies the user's identity and that might contain some scopes is not going to be hard, in my opinion. If that is enough for you, then I would just look at Auth0, Okta, and the native IDM solution of whatever cloud platform you are using (Cognito, Cloud Identity, etc.).
I think you will find that more features come with a very hefty price tag. Auth0 is far superior to Cognito, but Cognito still has enough features for basic use cases and will end up costing a fraction of Auth0 in large deployments. So everything comes with pros and cons. If you have very complex requirements, such as a bunch of big legacy repositories that you need to integrate, then products like Auth0 rapidly start looking more attractive.
Personally I would look at Auth0, Cognito, and Apigee, and my decision would depend massively on parameters that you haven't mentioned in your question. Obviously these are all SaaS solutions, which I think you should definitely be using anyway. I would not host this myself unless I had no other choice; going that route will radically limit your choices and probably increase expenses. All the cool stuff is happening in the cloud.
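As a rough illustration of the "verify a token and check its scopes" step, here is a minimal sketch using the PyJWT library against a provider's JWKS endpoint. The issuer URL, audience, and scope claim layout are placeholders of my own, not anything specific to Auth0 or Cognito:

```python
import jwt  # PyJWT

# Placeholder values: substitute your provider's issuer, JWKS URL, and audience.
ISSUER = "https://example-tenant.example-idp.com/"
JWKS_URL = ISSUER + ".well-known/jwks.json"
AUDIENCE = "https://api.example.com"

jwks_client = jwt.PyJWKClient(JWKS_URL)

def verify_and_get_scopes(token: str) -> set[str]:
    """Verify the signature and standard claims of a bearer token, return its scopes."""
    signing_key = jwks_client.get_signing_key_from_jwt(token)
    claims = jwt.decode(
        token,
        signing_key.key,
        algorithms=["RS256"],
        audience=AUDIENCE,
        issuer=ISSUER,
    )
    # Many providers put scopes in a space-separated "scope" claim.
    return set(claims.get("scope", "").split())

# An API handler's authorization check then reduces to a set test, e.g.:
# if "devices:write" not in verify_and_get_scopes(token): reject the request.
```

The point is that once the identity provider issues signed tokens, the per-request work on your side is cheap and stateless, which is what makes this part scale easily compared to the fine-grained authorization rules.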

Working with patient/customer data outside of the office

Background
I am a developer who works for a health care organization. We build a variety of business apps, the majority of which contain PHI (Protected Health Information). We work on laptops in-house and occasionally have the option to work from home. Something we are discussing, though, is how to handle the data stored on our laptops when we are working out of the office.
Although we have passwords and our laptops are encrypted, that still doesn't seem like enough to us to protect the data. What I mean by that is this: we are a small five-person team. When we are working on a task, we all work locally against our own databases on our laptops. When the change is done, we commit to SVN and publish to a test server. Our concern is that my local database is sometimes a copy of production so I can test against real data. That local database could contain thousands of records of PHI. This is obviously a major concern to us when we take our laptops out of our building, because if my laptop were stolen, I would be putting thousands of patients' health information at risk. Not something we want to do.
My Question
How do developers work, as a best practice, with regard to patient data safety? Or even financial data? Either way, how do people work with patient/customer data locally?
Is it fair to say that sometimes you just don't have the ability to connect to a database behind a firewall, or is that just negligence? Even if I keep the database internal, I still have project code on my laptop. Is that bad too?
• Should I have fake data? (See the sketch after this list.)
• Should all data be on an internal machine that you connect to?
• Should I only connect in to a machine that is internal?
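To make the fake-data option concrete, here is the sort of thing we are imagining: a minimal sketch that generates synthetic patient rows for local testing. The Faker library, the column names, and the diagnosis codes are illustrative choices on my part, not what our apps actually use:

```python
import csv
import random

from faker import Faker  # third-party library for generating synthetic test data

fake = Faker()

def make_fake_patients(count: int, path: str) -> None:
    """Write `count` entirely synthetic patient records; no real PHI is involved."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["patient_id", "name", "date_of_birth", "phone", "diagnosis_code"])
        for i in range(count):
            writer.writerow([
                i + 1,
                fake.name(),
                fake.date_of_birth(minimum_age=0, maximum_age=95).isoformat(),
                fake.phone_number(),
                random.choice(["E11.9", "I10", "J45.909"]),  # arbitrary example codes
            ])

make_fake_patients(1000, "fake_patients.csv")
```

The appeal of this approach is that a stolen laptop then contains nothing sensitive at all, at the cost of the fake data not exercising the odd shapes that real production data has.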
I can't imagine that always connecting to an internal machine is what people do all the time.
We are discussing this as a team and would love to hear your feedback on how you (or anyone else) work as a remote developer.
Thanks

Perforce: Any side-effects to sharing Login accounts / Client-Specs among multiple users?

I am currently working on a file system application in C# that requires users to login to a Perforce server.
During our analysis, we figured that having unique P4 login accounts per user is not really beneficial and would require us to purchase more licenses.
Considering that these users are contractual and will only use the system for a predefined amount of time, it's hard to justify purchasing licenses for each new contractual user.
With that said, are there any disadvantages to having a group of users share one common login account on a Perforce server? For example, we'd have X groups sharing X logins.
From a client-spec point of view, will Perforce be able to detect that even though someone has synced to head, the newly logged-in user (who's on another machine) also needs to sync to head? Or are all files flagged as synced to head since someone else already synced?
Thanks
The client specs are per machine, and so will work in the scenario you give.
However, Perforce licenses are strictly per person, so you would be breaking the license agreement and using the software illegally. I really would not advocate that.
In addition to the 'real' people you need licenses for, you can ask for a couple of free 'robot' accounts to support things like automatic build services, admin etc.
Perforce have had arrangements in the past for licensing of temporary users such as interns, and so what I would recommend is you contact them and ask what they can do for you in your situation.
Greg has an excellent answer and you should follow his directions first. But I would like to make a point on the technical side of sharing clients on multiple machines. This is generally a bad idea. Perforce keeps track of the contents of each client by client name only. So if you sync a client on one machine, and then try to sync the same client on another machine, then the other machine will only get the "recently" changed files and none of the changes that were synced on the first machine.
The result of this is that you have to do a lot of force syncing, or keep track of the changelists you sync to and do some flushing and then syncing.
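To make that workaround concrete, here is a hedged sketch of the two recovery options, driven from Python purely for illustration; the depot path and changelist number are placeholders:

```python
import subprocess

def p4(*args: str) -> None:
    """Run a p4 command (assumes P4PORT/P4USER/P4CLIENT are already configured)."""
    subprocess.run(["p4", *args], check=True)

def recover_by_force_sync() -> None:
    # Option 1: brute force. Re-fetch everything so the local files match
    # what the server's have-list says this (shared) client already has.
    p4("sync", "-f", "//...")

def recover_by_flush(last_synced_change: int) -> None:
    # Option 2: cheaper if you tracked the changelist this machine really has
    # on disk. "p4 flush" rewinds the have-list to that point without
    # transferring files; a normal sync then fetches only what is missing.
    p4("flush", f"//...@{last_synced_change}")
    p4("sync")

# Example: this machine last genuinely synced changelist 12345 (placeholder).
# recover_by_flush(12345)
```

Either way, this is bookkeeping you only need because the client is being shared; one client per machine avoids the problem entirely.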

How to achieve high availability?

My boss wants to have a system that takes into account a continent-wide catastrophic event. He wants to have two servers in the US and two servers in Asia (1 login server and 1 worker server on each continent).
In the event that an earthquake breaks the connection between the two continents, both should keep working on their own. When the connection is restored, they should sync with each other and return to normal.
An external cloud system is not allowed, as he has no confidence in it.
The system should take scalability into account, which means adding new servers should be easy to configure.
The servers should be load balanced.
The connection between the servers should be very secure (encrypted and sent over SSL, even though SSL already takes care of encryption).
The system should let one and only one user be logged in per account (beware of the latency between continents: two users sharing an account may reach both login servers at the same time).
Please help. I'm already at my wits' end. Thank you in advance.
I imagine that these requirements (if properly analysed) are essentially incompatible, in that they cannot all hold according to the CAP theorem.
If you have several datacentres, even if they are close by, partitions WILL happen. If a partition happens, either availability OR consistency MUST be lost, because either:
you have a pre-determined "master", which keeps working and other "slave" DCs which fail (or go readonly). This keeps consistency at the expense of availability.
OR you lose consistency for the duration of the partition (this means that operations which depend on immediate consistency are also unavailable).
This is incompatible with your requirements, as far as I can see. What your boss wants is clearly impossible. He needs to understand the CAP theorem.
Now, in YOUR application case, you may decide that you can bend the rules and redefine what consistency or availability mean, for convenience, and have a system which degrades into an inconsistent but temporarily acceptable state.
You probably want to get product management to have a look at the business case for these requirements. Dropping some of them is probably OK. Consistency is a good requirement to keep, as it makes things behave as people expect; that means dropping availability or partition tolerance. Keeping consistency is definitely easier from an engineering perspective.
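As a sketch of the first option above (a pre-determined "master" site, with the other site going read-only during a partition), here is roughly what that decision looks like in request-handling code. The health check and hostnames are illustrative placeholders, not a real replication protocol:

```python
import socket

PRIMARY_HOST = ("login-us.example.internal", 5432)  # placeholder primary address

def primary_reachable(timeout: float = 2.0) -> bool:
    """Crude health check: can we open a TCP connection to the primary site?"""
    try:
        with socket.create_connection(PRIMARY_HOST, timeout=timeout):
            return True
    except OSError:
        return False

def handle_request(is_write: bool) -> str:
    """During a partition, the secondary site keeps serving reads but refuses
    writes, trading availability (of writes) for consistency."""
    if is_write and not primary_reachable():
        return "503: primary unreachable, this site is read-only until the partition heals"
    return "OK: forward to primary" if is_write else "OK: serve from local replica"
```

The alternative branch of the trade-off would accept the write locally and reconcile later, which is exactly the "temporarily inconsistent but acceptable" state described above.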
This is another one of those things where employers tend not to understand the benefits of using an off-the-shelf solution. If you as a programmer don't really even know where to start with this, then rolling your own is probably going to be a huge money and time sink. There's nothing wrong with not knowing this stuff either; high-availability, failsafe networking that takes into consideration catastrophic failure of critical components is a large problem domain that many people pour a lot of effort and money into. Why not take advantage of what providers have to offer?
Give talking to your boss about using existing cloud providers one more try.
You could contact one of the solid and experienced hosting providers (we use Rackspace) that have data centers in different regions worldwide and get their recommendations based on your requirements.
This will require expert assistance, a large budget, and serious planning.
A better option would be to contact a reputable provider with a global footprint, select a premium solution with a solid SLA backing up their service, and let them tailor a solution that comes close to your needs.
Just realize that even the likes of Google, Yahoo, Microsoft, and Amazon (to name a few) have at one time or another had some issue that rendered segments of their systems offline for certain users.

What does LDAP solve?

I've come across LDAP in many projects I've been involved in, but, truth be told, I don't really understand it. I thought it was just a directory of people, but then I discovered that it can contain any kind of object in a hierarchical structure.
I installed OpenLDAP on my box, but the many tutorials I found cover just the installation.
What is LDAP? What are the scenarios where LDAP is the right choice? What are the LDAP concepts I should know to work with it? What are the advantages of LDAP? Is it used just because old applications used it? Is there a good document anywhere on the internet explaining all these questions?
UPDATE:
Complementing the answers, I found this link, which contains a quick-start guide for an LDAP newbie like me.
What is LDAP? What are the scenarios where LDAP is the right choice?
At its core, LDAP is a protocol for accessing objects that are suitable for storage in a directory. Whether something is "suitable" is an entirely subjective determination that's left up to implementers, but typically this means collections of many objects that each have infrequently (or never) updated data, where each object has an obvious or canonical way to be looked up:
a phone book (look up by name or by phone number)
titles in a library (look up by title, author, etc.)
tenants in a building (look up by floor, suite, name, etc.)
and so on.
Note that LDAP itself is just a protocol and doesn't provide any actual storage -- in much the same way, HTTP doesn't imply anything about whether you're using Apache, Jetty, Tomcat, Mongrel, et al. as a web server. (One problem with LDAP in general is the confusing reuse of names to mean different things. Wikipedia has a good section on this.)
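As a hedged illustration of that "canonical lookup" idea, here is a minimal search using the Python ldap3 library; the server address, bind credentials, base DN, and attribute names are placeholders for whatever your directory actually uses:

```python
from ldap3 import Server, Connection, ALL

# Placeholder connection details; substitute your directory's values.
server = Server("ldap://ldap.example.com", get_info=ALL)
conn = Connection(server, user="cn=reader,dc=example,dc=com",
                  password="secret", auto_bind=True)

# "Phone book" style lookup: find one person by uid and read a few attributes.
conn.search(
    search_base="ou=people,dc=example,dc=com",
    search_filter="(uid=jdoe)",
    attributes=["cn", "mail", "telephoneNumber"],
)

for entry in conn.entries:
    print(entry.cn, entry.mail, entry.telephoneNumber)

conn.unbind()
```

Note how the query is just a filter against a well-known location in the tree; there are no joins, which is part of why reads are so fast.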
DITs (directory information trees) are a hierarchical description scheme that lends itself very nicely to B-tree algorithms, resulting in tremendous search performance in most cases. Directory servers like OpenDS return indexed searches in microseconds, whereas RDBMS systems are much slower. Directory servers (often called LDAP servers) trade resources (RAM, CPU) for fast read response. RDBMS systems provide greater functionality in terms of managing the data in question. Need speed with few or zero updates, simplicity, and a small network protocol? Use a directory server. Need data management and mining capabilities, and/or a high rate of change of the database with relational aspects defined between data? Use an RDBMS (MySQL is your best bet here).
LDAP has O(1) read performance, in exchange for O(something worse) write performance. It's ideal for data that's accessed frequently, but changed rarely - directories of people, machine names and addresses, and so on. (hence the acronym: Lightweight Directory Access Protocol.)
LDAP is the right choice where the pain of using a database that isn't relational, in terms of decreased developer familiarity and strange performance characteristics, is less than the gain of blindingly fast read access.
This link will explain LDAP: http://blogs.oracle.com/raghuvir/entry/ldap
We use LDAP in our office for company-wide email address lookups. We also use it as a single sign-on service for our internal apps.
One perspective I like to harp on is that LDAP is an application on top of a persistence store, whereas a database is a persistence store. Both can be used to store user information.
LDAP gives you a hierarchy which is harder to do in a database. You can make a hierarchy in a database but it's harder to do things like delegation (these rows belong to you only) or ACLs on rows. So pushing security problems out of the database is easier if you use LDAP for storing user identities. Trying to solve it in the database is weird.
At the same time, LDAP is terrible for reporting against (transform LDAP to a DB for reporting). Storing attributes deep in the tree that need to be searched quickly can be problematic for performance (don't do this, have a DB on the side or try to flatten the query by redesigning your DIT). Storing attributes all over the place in a really deep DIT is just bad LDAP or system design but sometimes it's unavoidable if you're tied to a vendor product or legacy app.
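To make the "LDAP for storing user identities" point concrete, a common pattern is to authenticate a user by attempting to bind to the directory as that user's entry. This sketch uses the Python ldap3 library with a placeholder DN layout, so the exact structure will differ in your directory:

```python
from ldap3 import Server, Connection
from ldap3.core.exceptions import LDAPBindError

def authenticate(uid: str, password: str) -> bool:
    """Return True if the directory accepts these credentials (placeholder DN layout)."""
    # In real code, escape or look up the DN first rather than formatting raw input.
    user_dn = f"uid={uid},ou=people,dc=example,dc=com"
    server = Server("ldap://ldap.example.com")
    try:
        # A successful bind means the directory accepted the credentials.
        conn = Connection(server, user=user_dn, password=password, auto_bind=True)
        conn.unbind()
        return True
    except LDAPBindError:
        return False
```

This is the sense in which security concerns get "pushed out of the database": the application never stores or compares password hashes itself, it just asks the directory.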
LDAP is just a protocol; the Wikipedia article explains it adequately: http://en.wikipedia.org/wiki/Lightweight_Directory_Access_Protocol
It's a way to query an underlying organizational structure like Microsoft's Active Directory. You can use LDAP queries to get all kinds of information about users, use it for setting application rights, etc.
I am working part-time and am a full-time student. My curriculum encourages (read: requires) many group projects.
I have used OpenLDAP and phpLDAPadmin to control access to my Subversion and Mercurial repos, Trac projects, Hudson, etc. It wasn't easy to install, but the time saved on administration was a godsend.
If you have projects where you will have many groups of people who need to be able to use different resources, it is a good tool.
See this link, which explains LDAP in depth: http://www.umich.edu/~dirsvcs/ldap/doc/guides/slapd/1.html#RTFToC1
That documentation also includes an illustrative diagram (source: dirsvcs at www.umich.edu).
LDAP is an access protocol; it only provides an API to the underlying technology for which you are trying to find applications - a directory service. OpenLDAP is one of the open source directory services; Sun has another implementation called OpenDS. Active Directory and Novell NDS are another two commonly seen in the field.
The directory can be used for storing information about any sort of resource, and the relationships between the resources - for example, rights of a user to a directory, a printer, or a network access device.
Is there a good document anywhere on the internet explaining all these questions?
IBM published an excellent Red Book about LDAP. The title is:
Understanding LDAP - Design and Implementation.
It can be downloaded from the previous link.
In one of my old workplaces we used LDAP as our primary user authentication system.
This in turn provided our various systems with information such as which department people belonged to, where they should mount their home directories, their contact information, and employee management data.
Other things that we had hooked up to work through LDAP (though not necessarily controlled by it) were SQL user accounts, K4, Samba, and email account generation.