Lessons learned at Øredev 2014
I recently attended the Øredev 2014 conference, and I must say it was a good one. Some of the talks were boring, but most of them were actually quite good.
I took the security training and attended the three-day conference. I tried to avoid talks whose content I could just read up somewhere on the internet; I wanted the speakers’ first-hand experience. That’s why most of the points below are about how to do something well. Sometimes I didn’t have much choice, and when that happened I just switched rooms randomly. :)
And of course, the points below are not everything I heard, nor is everything new to me. They are simply what I remember most strongly and think is interesting to share.
An EU unified data protection standard is coming. Every company that wishes to operate in the EU or do business with / sell to EU countries will have to comply. For example, users must be notified of a security breach within 24 hours after it is detected. Fines can go up to €100 million or 5% of annual turnover.
Attacks can happen at the network level. Scripts can be injected into a page when it passes through a malicious router.
We need to categorize which data inputs can be trusted and which cannot. Everything that is not trusted needs to be validated. A good approach: wrap untrusted data in a wrapper, and only allow the value to be read after a validator has validated it (see the sketch below).
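Here is a minimal sketch of that wrapper idea in Java; the class and interface names are my own illustration, not something from the talk:

```java
// A minimal sketch of the untrusted-data wrapper idea: the raw value is only
// reachable after a validator has accepted it.
public final class Untrusted<T> {

    @FunctionalInterface
    public interface Validator<T> {
        boolean isValid(T value);
    }

    private final T value;

    private Untrusted(T value) {
        this.value = value;
    }

    public static <T> Untrusted<T> of(T value) {
        return new Untrusted<>(value);
    }

    // The raw value can only be read after it has passed validation.
    public T validatedWith(Validator<T> validator) {
        if (!validator.isValid(value)) {
            throw new IllegalArgumentException("Untrusted value failed validation");
        }
        return value;
    }
}
```

Usage would then look like `String email = Untrusted.of(rawInput).validatedWith(v -> v.matches("[^@]+@[^@]+"));`, so raw input can never sneak into the rest of the code unvalidated.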
Phones are not secure. Even with the latest updates, your phone can still be rooted by an application, and after that the malware can survive even a factory reset. Mac OS is not secure either; thinking it is secure is a problem in itself. We need to protect our development machines properly, whatever they are.
Always use SSL; there’s no good reason not to.
Man-in-the-middle attacks can happen in your build system because it fetches dependencies over plain HTTP. Always check the signatures of the packages (not only the checksums). You don’t want your artifact to open a bean shell for remote access.
There are so many things that can go wrong with file uploads. Be very careful: every validation you know of can be bypassed. Checking the content type based on magic bytes? You can be hacked because of that. Treat every incoming file like a dirty plastic bag you found on the street. On every upload, rename the file and save its metadata in the database. On download, return the file with the content type you saved at upload time (see the sketch below).
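A minimal sketch of that upload handling in Java, assuming a hypothetical metadata repository and a storage directory outside the web root:

```java
// Rename every uploaded file, store the client-supplied details only as metadata,
// and never trust anything the client sent. All names here are illustrative.
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

public class UploadService {

    public interface FileMetadataRepository {
        void save(String storedName, String originalName, String contentType);
    }

    private final Path storageDir = Paths.get("/var/app/uploads"); // assumption: outside the web root
    private final FileMetadataRepository metadata;

    public UploadService(FileMetadataRepository metadata) {
        this.metadata = metadata;
    }

    public String store(String originalName, String declaredContentType, InputStream body) throws IOException {
        // Never reuse the client-supplied file name; generate our own, with no extension.
        String storedName = UUID.randomUUID().toString();
        Files.copy(body, storageDir.resolve(storedName));

        // Keep the original name and declared content type only as metadata in the database.
        metadata.save(storedName, originalName, declaredContentType);
        return storedName;
    }
}
```

On download you would look the metadata up by the stored name and set the response’s Content-Type from it, rather than letting the server guess from the file itself.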
Software definitely needs a supply chain system. Car companies have one: on a defect, all affected parts can be recalled efficiently. In software we have nothing like it, which leaves a lot of software out there open to attack. Many systems also can’t be updated (like old routers), which means their security holes can’t be fixed and you have to throw the machines away. This is not acceptable; we need a standard for this, a supply chain system for software.
Domain-driven design is the way to go for continuous delivery. It also enables decentralized data management: one project, one database. That makes the database easy to replace. Let’s say you use MongoDB and after 10 months you decide, for whatever reason, that it was a bad choice; you can simply replace it with PostgreSQL, because in your domain you are the owner of the database and everybody else accesses the data through you (a small sketch of this follows).
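A tiny sketch of what “owning the database” can look like in code; the names are illustrative only:

```java
// Other domains only ever talk to this interface, never to the datastore behind it.
public interface CustomerDomain {

    Customer findById(String customerId);

    void register(Customer customer);

    // The representation exposed to other domains; it carries no persistence details.
    class Customer {
        public final String id;
        public final String name;

        public Customer(String id, String name) {
            this.id = id;
            this.name = name;
        }
    }
}
// One implementation might be backed by MongoDB today and by PostgreSQL in ten months;
// callers of CustomerDomain never notice the difference, because nobody else touches the database directly.
```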
Yesterday’s best practice might be tomorrow’s anti-pattern. Example: shared resources used to be a best practice, but today they are an anti-pattern. Sharing things makes it hard to decouple the system properly and prevents isolated changes.
Reduce dependencies as much as you can. Don’t even try to handle circular dependencies; tools like Maven may cope with them for you, but they will be very hard to debug when a problem shows up. Use tools like JDepend to write unit tests about your dependencies (see the sketch below).
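A minimal sketch of such a dependency test with JDepend, assuming a standard Maven layout with compiled classes in target/classes:

```java
// Fails the build as soon as someone introduces a circular package dependency.
import static org.junit.Assert.assertFalse;

import java.io.IOException;
import jdepend.framework.JDepend;
import org.junit.Test;

public class DependencyCycleTest {

    @Test
    public void packagesContainNoCycles() throws IOException {
        JDepend jdepend = new JDepend();
        jdepend.addDirectory("target/classes"); // assumption: standard Maven layout
        jdepend.analyze();

        assertFalse("Circular package dependency detected", jdepend.containsCycles());
    }
}
```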
Prefer choreography over orchestration. Don’t use a central event bus to trigger every possible thing, as if it were the conductor of an orchestra. Let each domain call another domain directly when it needs information. The event bus should only be used for messaging.
Use consumer-driven contracts when doing choreography. There should be a unit test in the project that guarantees backward compatibility for each consumer (a sketch follows).
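A small, self-contained sketch of the idea from the provider’s side; real projects often use a dedicated tool (Pact, for example), but even a hand-rolled test like this makes the contract explicit. All names here are hypothetical:

```java
// The provider keeps one test per consumer describing what that consumer depends on,
// so a breaking change to the payload fails the provider's build.
import static org.junit.Assert.assertTrue;

import java.lang.reflect.Field;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import org.junit.Test;

public class BillingConsumerContractTest {

    // The provider's response representation (hypothetical).
    static class OrderPayload {
        String id;
        String totalAmount;
        String currency;
        String internalNote; // providers may add fields freely; consumers must tolerate them
    }

    @Test
    public void orderPayloadStillContainsWhatBillingNeeds() {
        List<String> provided = Arrays.stream(OrderPayload.class.getDeclaredFields())
                .map(Field::getName)
                .collect(Collectors.toList());

        // The billing consumer reads exactly these fields; renaming or removing one breaks it.
        assertTrue(provided.containsAll(Arrays.asList("id", "totalAmount", "currency")));
    }
}
```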
REST is better than SOAP because it is easier to keep backwards compatible. It is more error prone, but the requests are easier to shape and maintain.
Avoid a monolithic architecture; it makes it hard to replace only a part of the system. Use microservices, which give you much more flexibility, but make sure you have discipline, because microservices without discipline are just a mess.
Conway’s law: any organization that designs a system… will produce a design whose structure is a copy of the organization’s communication structure. If the organization consists of a UI team, a DB team, and a backend team, you will get a system in which UI, DB, and backend are very isolated from each other. Mix the teams instead, so that each team has a DB person, a UI person, a backend person, and an operations person. That way nobody can say it’s a DB problem, a UI problem, a backend problem, or an operations problem. A team should ship a product, not a project, all the way from the lowest level (the DB) up to the operational work. You build it, you run it.
Prefer BASE over ACID as your system becomes more distributed. ACID is not efficient, and because of the CAP theorem it is not really possible in a distributed environment. The real world works on BASE rather than ACID anyway: when you pay in a shop, you don’t hand over the money and take your goods at the exact same second. And please don’t think transactions are free.
Releasing is not the same as deploying. You can deploy a project or a feature, but you haven’t released it until you turn it on. (A small feature-toggle sketch of this idea follows.)
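A minimal feature-toggle sketch to illustrate the difference; the toggle mechanism here (a properties file) is just my own illustration:

```java
// The new code path can be deployed for weeks while its flag stays false;
// flipping the flag to true is the actual release.
import java.io.IOException;
import java.io.InputStream;
import java.util.Properties;

public class FeatureToggles {

    private final Properties flags = new Properties();

    public FeatureToggles(InputStream config) throws IOException {
        flags.load(config); // e.g. a file containing: new-checkout=false
    }

    public boolean isEnabled(String feature) {
        return Boolean.parseBoolean(flags.getProperty(feature, "false"));
    }
}
```

The calling code then branches on `toggles.isEnabled("new-checkout")`: deploy whenever you like, release when you flip the flag.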
A reactive streams API is coming. It’s an open standard for reactive messaging systems like Akka, Reactor, RxJava, Vert.x, etc. They are all already working together and have defined a common interface. It uses a publisher/subscriber model, as usual (the interfaces are sketched below). Read more about it online.
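For illustration, this is roughly what the publisher/subscriber interfaces of the Reactive Streams standard look like (sketched from memory, so check reactive-streams.org for the authoritative definitions):

```java
// In the real spec these are public interfaces in the org.reactivestreams package;
// they are shown package-private here only so the sketch fits in one file.
interface Publisher<T> {
    void subscribe(Subscriber<? super T> subscriber);
}

interface Subscriber<T> {
    void onSubscribe(Subscription subscription);
    void onNext(T element);
    void onError(Throwable error);
    void onComplete();
}

interface Subscription {
    void request(long n); // back pressure: the subscriber asks for at most n more elements
    void cancel();
}
```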
Reactive is not a new concept either; it has been around for a long time, and it’s really just a set of properties. As long as an application fulfills the four characteristics (responsive, resilient, elastic, and message-driven), it is reactive.
ASP.NET vNext is coming. It’s very lightweight and can also run on Mono on Linux and Mac OS X. Remember the snippet on the front page of nodejs.org that starts a simple server? You can now write a snippet like that with ASP.NET vNext too.
Too many things can go wrong at the UI level. Make sure everything is secured properly in the backend.
Don’t go “enterprise” with Java EE; go with a lightweight mindset. Throw away everything you don’t need. Don’t just include a library like Guava without thinking, or without using it extensively. Strip out everything you don’t need and make it fast. Keep your pom file very small; most of the stuff in it you don’t actually need. Delete first, ask later. Focus on business logic instead.
Unit test a lot, but don’t unit test everything; only the complex business logic. Try to write every test you are going to write as a plain unit test. Most of the time you won’t need an integration test or an Arquillian test. Need an EntityManager? You can call it directly without Arquillian, using plain Java (see the sketch below).
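A minimal sketch of such a plain-Java JPA test, assuming a persistence unit named "test-pu" (typically backed by an in-memory database) in the test resources; the entity and names are illustrative:

```java
// No Arquillian, no container: just bootstrap JPA directly in the test.
import static org.junit.Assert.assertEquals;

import javax.persistence.EntityManager;
import javax.persistence.EntityManagerFactory;
import javax.persistence.Persistence;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;

public class CustomerRepositoryTest {

    private EntityManagerFactory emf;
    private EntityManager em;

    @Before
    public void setUp() {
        emf = Persistence.createEntityManagerFactory("test-pu"); // assumption: defined in test persistence.xml
        em = emf.createEntityManager();
    }

    @After
    public void tearDown() {
        em.close();
        emf.close();
    }

    @Test
    public void startsWithNoCustomers() {
        Long count = em.createQuery("select count(c) from Customer c", Long.class)
                       .getSingleResult();
        assertEquals(0L, count.longValue());
    }
}
```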
Integration tests are not very useful except for testing Java EE related stuff. And they’re very slow. Avoid them as long as you can.
Try to break your application every night. A lot of problems only show themselves when the system is being stress-tested.
Only use patterns when you have a problem. If you look in the books, every pattern is there to solve a problem; it starts from a problem. Don’t just assume you have the problems from the very beginning. “I used all the patterns from your book.” “Are you crazy? You can’t possibly have all the problems in the book!”
Parkinson’s law of triviality: the time spent on any item of the agenda will be in inverse proportion to the sum [of money] involved. Basically, the bigger you are, the more trivial the stuff you discuss. Avoid that.
Use whatever process works. Don’t be blinded by a process.
Use standard stuff whenever you can. Don’t overthink things if you don’t need to.
Modules installed by npm don’t always continue to work perfectly. Always use npm shrinkwrap to lock down the dependencies of what’s running locally; it creates a special JSON file with the exact versions the current system is running. When you want to update them, use npm outdated to find out which ones can be updated.
Use npm install --production to make your build faster if you’re not developing.
Avoid global installs. Sometimes your project only works because you happen to have some package installed globally, and at some point that gets very confusing.
Use the async library to take care of your code flow. It’s really useful for complex callbacks or promises.
Try/catch doesn’t really work with async code. A new feature called Zone is coming, and it will solve the problem. Google it!
Let Node handle the business logic and forget about HTTP concerns in Node. Forget caching and all of that; let a real web server like nginx take care of it.
Check the queries generated by your ORM; sometimes they are inefficient. You need to take care of your ORM too.
ES6 will standardize the current module landscape, which consists of the very different AMD and CommonJS systems.
Your CPU is really fast; most of its time is spent waiting, or figuring out what to do next before it actually happens. Depending on how big your data is and where it has to be fetched from, the speed of your application changes. There are the L1, L2, and L3 caches, RAM, and the hard disk. Using a beer-drinking analogy: L1 is the beer in your hand, L2 is the beer cooler by the sofa, L3 is your fridge, main memory is walking to the store just below your apartment, and disk access is flying from Europe to the USA to buy the beer. (A small sketch follows.)
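A small illustration of the effect (a rough sketch, not a rigorous benchmark): both loops below do the same amount of work, but the second one jumps around in memory and misses the caches far more often.

```java
// Row-major traversal walks memory sequentially and stays cache-friendly;
// column-major traversal strides across it and pays for the misses.
public class CacheLocalityDemo {

    public static void main(String[] args) {
        int n = 4096;
        int[][] matrix = new int[n][n];

        long start = System.nanoTime();
        long sumRowMajor = 0;
        for (int row = 0; row < n; row++)
            for (int col = 0; col < n; col++)
                sumRowMajor += matrix[row][col];          // sequential access
        System.out.println("row-major:    " + (System.nanoTime() - start) / 1_000_000 + " ms");

        start = System.nanoTime();
        long sumColMajor = 0;
        for (int col = 0; col < n; col++)
            for (int row = 0; row < n; row++)
                sumColMajor += matrix[row][col];          // strided access, many cache misses
        System.out.println("column-major: " + (System.nanoTime() - start) / 1_000_000 + " ms");
    }
}
```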
Using more cores can actually slow your application down, because the CPU has to mark the data being shared as shared, properly invalidate it in the other cores’ caches, and then put it back into the shared state again.
Branch prediction is very important, precisely because your CPU already spends so much time waiting. If most of its branch predictions fail, you will experience a slowdown (the sketch below makes the effect visible).
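Another rough sketch, not a rigorous benchmark: the loop below is identical for both runs, but on sorted data the branch becomes almost perfectly predictable and the loop usually runs noticeably faster.

```java
// Same values, same work; the only difference is how predictable the branch is.
import java.util.Arrays;
import java.util.Random;

public class BranchPredictionDemo {

    public static void main(String[] args) {
        int[] data = new Random(42).ints(1 << 20, 0, 256).toArray();

        System.out.println("random data: " + time(data) + " ms");
        Arrays.sort(data); // the branch now flips only once over the whole array
        System.out.println("sorted data: " + time(data) + " ms");
    }

    static long time(int[] data) {
        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < 100; i++)
            for (int value : data)
                if (value >= 128)   // hard to predict on random data, trivial on sorted data
                    sum += value;
        if (sum == 42) System.out.println(); // keep the JIT from removing the loop entirely
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```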
Know your CPU, sometimes it matters.
Project Orleans is a new actor model framework. Scalability is linear. The Halo 4 backend services were developed with this framework and didn’t experience any major downtime at launch: 11.6 million players, 1.5 billion games, 270 million hours. It fulfills the A and P parts of the CAP theorem.
DB development, migrations, and scripts are all part of the product. Everything database related needs to be put into version control.
Use tools like DB Deploy or Liquibase to roll out database changes; no manual tweaking. If a hotfix is needed, that hotfix should also go into version control.
Database tables should be able to support multiple versions of the application. Let’s say we have a full name column and later split it into first name and last name. Even though we now have the new first name and last name columns, we should also keep the full name column, and use a trigger to keep all of the columns updated appropriately.
Use database views as an interface to the underlying tables.
Spray.io will be integrated into Akka, and Play mini as well. The deadline is Q1 2015, so just wait for it and use Akka if possible.