Aug 15, 2015

Spring Data JPA repository method examples

Once you have created a core project as shown in the last blog, in this tutorial we will look at repository methods for various SELECT queries: IN, GREATER THAN, LESS THAN, LIKE and so on. Following is the JUnit test class.
Following is the repository.
Next, the service interface.
And finally, the service implementation.
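The original source listings did not survive in this post. As a rough reconstruction, the repository interface could declare finder methods matching the queries in the log below; the names CountryBean and CountryRepository are inferred from the Hibernate aliases and test names in the log and are assumptions, not the author's actual code.

```java
import java.util.List;

import org.springframework.data.jpa.repository.JpaRepository;

// Hypothetical reconstruction -- method names are inferred from the logged queries.
public interface CountryRepository extends JpaRepository<CountryBean, String> {

    // WHERE gnp > ?
    List<CountryBean> findByGnpGreaterThan(double gnp);

    // WHERE code LIKE ?
    List<CountryBean> findByCodeLike(String codePattern);

    // WHERE code IN (?, ?)
    List<CountryBean> findByCodeIn(List<String> codes);

    // WHERE code = ? AND name = ?
    CountryBean findByCodeAndName(String code, String name);
}
```

Spring Data derives each query from the method name at runtime, so no implementation class is needed for any of these.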

Following log is for reference.

2015/08/15T23:38:49,203 DEBUG [com.techcielo.sphr.core.test.CountryServiceTest] [testFindByColumn] - Finding Records for GNP > 50000
Hibernate: select countrybea0_.code as code1_0_, countrybea0_.gnp as gnp2_0_, countrybea0_.name as name3_0_ from Country countrybea0_ where countrybea0_.gnp>?
2015/08/15T23:38:49,382 DEBUG [com.techcielo.sphr.core.test.CountryServiceTest] [testFindByPartialCode] - Finding Records for Code LIKE AR
Hibernate: select countrybea0_.code as code1_0_, countrybea0_.gnp as gnp2_0_, countrybea0_.name as name3_0_ from Country countrybea0_ where countrybea0_.code like ?
2015/08/15T23:38:49,393 DEBUG [com.techcielo.sphr.core.test.CountryServiceTest] [testFindSingle] - Finding Single Record by Code = IND
Hibernate: select countrybea0_.code as code1_0_, countrybea0_.gnp as gnp2_0_, countrybea0_.name as name3_0_ from Country countrybea0_ where countrybea0_.code=?
2015/08/15T23:38:49,402 DEBUG [com.techcielo.sphr.core.test.CountryServiceTest] [testFindByCodeList] - Finding Records for Code in (IND,USA)
Hibernate: select countrybea0_.code as code1_0_, countrybea0_.gnp as gnp2_0_, countrybea0_.name as name3_0_ from Country countrybea0_ where countrybea0_.code in (? , ?)
2015/08/15T23:38:49,415 DEBUG [com.techcielo.sphr.core.test.CountryServiceTest] [testFindByCodeAndName] - Finding Records for Code=IND and CountryName=India
Hibernate: select countrybea0_.code as code1_0_, countrybea0_.gnp as gnp2_0_, countrybea0_.name as name3_0_ from Country countrybea0_ where countrybea0_.code=? and countrybea0_.name=?

For an exhaustive list, please refer to the Spring Data JPA reference documentation.

Aug 14, 2015

Using Spring Data JPA with Hibernate

Once we have created our core workspace as explained in the last article, we will create a few core components that will form the basis of any project that operates on the data exposed by this project or performs operations on this DB. So now we create our Spring configuration file countries-app-context.xml as shown below.
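The original countries-app-context.xml listing was not preserved here. A hedged sketch of such a core context file, assuming the com.techcielo.sphr.core package seen in the test logs, could look like the following:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch: the original file did not survive in this post. -->
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/context
           http://www.springframework.org/schema/context/spring-context.xsd">

    <!-- Scan the core packages for annotated components -->
    <context:component-scan base-package="com.techcielo.sphr.core"/>

    <!-- Pull in module-specific configuration, e.g. the JPA context -->
    <import resource="classpath:countries-jpa-context.xml"/>
</beans>
```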

As one can notice, this configuration contains only core information. Information specific to each module will be moved to a separate config file. For example, information about the DB will be maintained in a separate configuration file called countries-jpa-context.xml, as follows.
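The countries-jpa-context.xml listing was also lost. A typical Spring Data JPA context for this setup, sketched here with assumed package names and a placeholder MySQL URL (none of these values are from the original post), would wire a data source, an entity manager factory and a transaction manager:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch: package names and connection details are assumptions. -->
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:jpa="http://www.springframework.org/schema/data/jpa"
       xmlns:tx="http://www.springframework.org/schema/tx"
       xsi:schemaLocation="http://www.springframework.org/schema/beans
           http://www.springframework.org/schema/beans/spring-beans.xsd
           http://www.springframework.org/schema/data/jpa
           http://www.springframework.org/schema/data/jpa/spring-jpa.xsd
           http://www.springframework.org/schema/tx
           http://www.springframework.org/schema/tx/spring-tx.xsd">

    <!-- Enable Spring Data JPA repositories -->
    <jpa:repositories base-package="com.techcielo.sphr.core.repository"/>

    <bean id="dataSource"
          class="org.springframework.jdbc.datasource.DriverManagerDataSource">
        <property name="driverClassName" value="com.mysql.jdbc.Driver"/>
        <property name="url" value="jdbc:mysql://localhost:3306/world"/>
        <property name="username" value="${db.user}"/>
        <property name="password" value="${db.password}"/>
    </bean>

    <bean id="entityManagerFactory"
          class="org.springframework.orm.jpa.LocalContainerEntityManagerFactoryBean">
        <property name="dataSource" ref="dataSource"/>
        <property name="packagesToScan" value="com.techcielo.sphr.core.bean"/>
        <property name="jpaVendorAdapter">
            <bean class="org.springframework.orm.jpa.vendor.HibernateJpaVendorAdapter"/>
        </property>
    </bean>

    <bean id="transactionManager"
          class="org.springframework.orm.jpa.JpaTransactionManager">
        <property name="entityManagerFactory" ref="entityManagerFactory"/>
    </bean>

    <tx:annotation-driven/>
</beans>
```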

Once this structure is created, let's decide at a high level what we need to create. We have three entities in total: Country, City and CountryLanguage. For these three entities (keeping a simple one-to-one relationship between a DB table and a Java bean) we need to create three Java beans. To access these beans we need a repository, and to use that repository and perform some business logic we need a service class. Last but not least, to test the entire flow we will need our JUnit classes.

So let's first start by creating the interfaces; in the next step we will create JUnit test cases based on the DB information we have, and once those are ready we will create the implementation classes. Each service interface will have two methods: the first will find a record by primary key, and the second will find records by the value of one of the table's columns. So the structure will be as follows.

For simplicity we will create the flow for only one entity (Country); interested readers can do the same for City. CountryLanguage involves a special scenario that we will cover in later tutorials. First we create a bean class for Country.
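The bean listing itself did not survive. As a rough reconstruction, with field names inferred from the logged Hibernate queries (code, name, gnp) and the class name CountryBean inferred from the Hibernate alias countrybea0_, the entity could look like this:

```java
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.Table;

// Hypothetical reconstruction -- fields inferred from the logged queries.
@Entity
@Table(name = "Country")
public class CountryBean {

    @Id
    private String code;   // primary key, e.g. "IND"

    private String name;

    private double gnp;

    public String getCode() { return code; }
    public void setCode(String code) { this.code = code; }

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

    public double getGnp() { return gnp; }
    public void setGnp(double gnp) { this.gnp = gnp; }
}
```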

Next we create service interface as shown below.
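The interface listing is missing here; following the two-method structure described above (find by primary key, find by column value), a sketch could look like the following. The names are assumptions:

```java
import java.util.List;

// Hypothetical sketch of the two methods described above.
public interface CountryService {

    // Find a single record by primary key (country code)
    CountryBean findByCode(String code);

    // Find records by a column value -- here, GNP above a threshold
    List<CountryBean> findByGnpGreaterThan(double gnp);
}
```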

Now create a test case.
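The test class listing was not preserved either. A hedged sketch of a Spring-aware JUnit 4 test, with the class and method names inferred from the log output (CountryServiceTest, testFindByColumn, testFindSingle), might look like this:

```java
import static org.junit.Assert.assertNotNull;

import java.util.List;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

// Hypothetical reconstruction -- names inferred from the logged output.
@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration("classpath:countries-app-context.xml")
public class CountryServiceTest {

    @Autowired
    private CountryService service;

    @Test
    public void testFindSingle() {
        // Find a single record by primary key
        assertNotNull(service.findByCode("IND"));
    }

    @Test
    public void testFindByColumn() {
        // Find records by a column value
        List<CountryBean> list = service.findByGnpGreaterThan(50000);
        assertNotNull(list);
    }
}
```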

So once we have the test case ready, we will do the actual development (not quite common, but this is how TDD, or Test-Driven Development, is supposed to work). First we create a repository interface which will extend JpaRepository.
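The repository listing did not survive; a sketch consistent with the two service methods, using the assumed CountryBean entity, could be as simple as:

```java
import java.util.List;

import org.springframework.data.jpa.repository.JpaRepository;

// Spring Data derives the queries from the method names at runtime;
// no implementation class is needed.
public interface CountryRepository extends JpaRepository<CountryBean, String> {

    CountryBean findByCode(String code);

    List<CountryBean> findByGnpGreaterThan(double gnp);
}
```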

Next we will create the service implementation and the repository. The service implementation will be as shown below.
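Since the implementation listing is missing, here is a hedged sketch that simply delegates to the repository; the class and field names are assumptions:

```java
import java.util.List;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

// Hypothetical sketch -- the service delegates each call to the repository.
@Service
public class CountryServiceImpl implements CountryService {

    @Autowired
    private CountryRepository repository;

    @Override
    public CountryBean findByCode(String code) {
        return repository.findByCode(code);
    }

    @Override
    public List<CountryBean> findByGnpGreaterThan(double gnp) {
        return repository.findByGnpGreaterThan(gnp);
    }
}
```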
A few cool features of the repository are:

  • You just need to follow the convention for naming your methods in the interface.
  • findByCode will result in the query "SELECT * FROM COUNTRY WHERE Code=?"
  • findByGnpGreaterThan will result in "SELECT * FROM COUNTRY WHERE gnp>?"
Now run your test case and the following will be the output.

Following log is for reference.

2015/08/15T01:11:28,231 DEBUG [org.jboss.logging] [<clinit>] - Logging Provider: org.jboss.logging.Log4jLoggerProvider

2015/08/15T01:11:29,565 WARN  [org.hibernate.jpa.internal.EntityManagerFactoryRegistry] [addEntityManagerFactory] - HHH000436: Entity manager factory name (default) is already registered.  If entity manager will be clustered or passivated, specify a unique value for property 'hibernate.ejb.entitymanager_factory_name'
Hibernate: select countrybea0_.code as code1_0_, countrybea0_.gnp as gnp2_0_, countrybea0_.name as name3_0_ from Country countrybea0_ where countrybea0_.gnp>?
Hibernate: select countrybea0_.code as code1_0_, countrybea0_.gnp as gnp2_0_, countrybea0_.name as name3_0_ from Country countrybea0_ where countrybea0_.code=?

Aug 13, 2015

Spring Hibernate Workspace creation

While developing any complex system it is advisable to break the entire project into small, (probably) reusable components and, as and when required, include one (Maven) project in another. In this blog let's create an Eclipse workspace with the following frameworks. This workspace will basically serve the purpose of developing the core components of any project. We will use the World database made available by MySQL instead of running around to create some hypothetical application.
  • Spring (core, context and orm)
  • Hibernate
  • Spring Data JPA
  • slf4j with log4j (for logging and flexibility to move from log4j to any other logging mechanism)
  • JUnit
  • MySQL Driver (assuming most of your projects will need the same DB; if there are multiple DBs then load all the required drivers)
So first create a Maven project in Eclipse.

Now edit pom.xml as follows.
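The pom.xml listing did not survive in this post. A hedged sketch of the <dependencyManagement> section it describes could look like the following; the artifact versions are plausible examples from mid-2015, not the author's actual choices:

```xml
<!-- Hypothetical fragment: versions are illustrative, not from the original post. -->
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.springframework</groupId>
            <artifactId>spring-orm</artifactId>
            <version>4.1.6.RELEASE</version>
        </dependency>
        <dependency>
            <groupId>org.springframework.data</groupId>
            <artifactId>spring-data-jpa</artifactId>
            <version>1.8.0.RELEASE</version>
        </dependency>
        <dependency>
            <groupId>org.hibernate</groupId>
            <artifactId>hibernate-entitymanager</artifactId>
            <version>4.3.10.Final</version>
        </dependency>
        <dependency>
            <groupId>mysql</groupId>
            <artifactId>mysql-connector-java</artifactId>
            <version>5.1.35</version>
        </dependency>
    </dependencies>
</dependencyManagement>
```

Child projects then declare each dependency without a version and inherit the managed one.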

There are few points here

  • Use the dependency management tag to ensure all projects and sub-projects extend this core project and use the same versions of the libraries, rather than loading different versions of libraries for different projects.
  • All the libraries that you may possibly use in projects (or sub-projects) at a later stage need to have an entry in <dependencyManagement> in the pom.xml of this core project.
  • Under dependencies, use only the libraries that are part of this project; e.g. if Spring Integration is not required in the core project, do not include it there.
Once this workspace is created, in the next few articles we will build sample code using the Spring, Spring Integration, Spring Batch and Spring Data JPA frameworks.

Aug 8, 2015

Best Practices for Writing a batch job

Batch jobs are like unsung heroes. They run in the middle of the night when no one (almost no one, except that one support guy) is watching them. They take all the load (system resources) and do all the hard work, but people remember them only when they fail (oops, too cheesy... let's come to the point). While writing any batch job there are certain aspects that any system analyst needs to take care of. Following are a few aspects that I found quite important. Note that this is in addition to having the job properly documented.
  1. Transactions: First and foremost, always plan your transactions properly. When you are loading millions of records you need to make a choice: do you want to commit after every record, or insert all records with a single commit? If you are using Hibernate, set its batch-operation properties correctly. If it is JDBC, use addBatch and executeBatch properly.
  2. Logging: Logging is always required. Ensure that all important steps generate enough information in the log file. Ensure that the proper logging level (trace, debug, info, warn or error) is used while logging the information. In exception conditions, ensure that the logged message provides sufficient information instead of something generic like "Error occurred while getting data from server."
  3. Log file for each run: Ensure that the job creates a separate log file for each run. This helps in comparing the generated logs, and any abnormalities can be noticed at a quick glance.
  4. Externalising configurations: Changes are bound to happen, so try to externalise anything that can vary over a period of time, e.g. the IP address/hostname/ID/password of the database connection, URLs to be hit for retrieving information, etc. Since a job, once written, runs for years, the last thing one wants is to recompile the code just to change configuration.
  5. Multiple entry points for the job: If your code contains some information that may need to be regenerated, or a certain piece of code needs to be run standalone, then ensure that the existing code has an entry point for this and that this entry point is documented properly; for example, the main job, a part of the job to generate an encrypted password, and a part of the job just to check whether all configurations are correct.
  6. Flexibility to rerun the job: There are chances that while the job is running there will be some exceptional condition and a certain part of the job will not be executed; e.g. data is loaded from a legacy system into a relational DB, but the reports that need to be generated from the DB are not generated because the file system ran out of space. In such a scenario you do not want to run the entire job again; you may want to run only a part of it. At the same time, as an architect you may also want to consider a case where some tables are loaded partially and the remaining data was not loaded: what best serves the business case? Do you want to delete all data that was loaded in the failed run, or do you want to restart from the last point of failure? This is one of the most important aspects of any batch job.
  7. Flexibility to run the job for a particular period: There will be a time in the lifetime of a batch job when it did not run for a certain period (say it failed for a couple of days and no one was able to fix it), and hence there will be a loss of data for this period. Considering this scenario, always ensure that your job has the flexibility to run from a specific start date to an end date. No one would like to run the job for one particular date at a time, over and over.
  8. Housekeeping of log files and data: If your job generates a log for each run, half the battle is won, but when your job runs for months and years it keeps creating logs, which keep eating space on your NAS or on your servers. You need to clean them up, so ensure that you have a proper mechanism to archive your logs; at the very least, move them into a zip file with a month-end job. The same holds true for data loaded into tables (or tables generated in the DB): once you don't need them, move or archive them.
  9. Status update of the job: In most cases there will be a job depending on the status returned by another job. Even if this is not the case when you first design your system, it is highly likely that at a later stage of the project it will be needed, so always ensure that your job returns (notifies) the status of its run. A few options are generating a zero-byte success or failure file, sending an MQ message, updating a flag in the DB, and so on.
  10. Flag when it fails: Ensure a failed run is flagged loudly (a non-zero exit code, a failure file or a notification) rather than failing silently.
  11. Where the data comes from and where it goes: The person designing this job needs to understand the end-to-end flow; e.g. Who is providing this data? In what formats? What will happen if data sent once is lost; can we get the same message/file again? If so, how? Who consumes the data this job has loaded? What sanity checks exist downstream for the data this job has stored?
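To make point 7 concrete, here is a minimal, self-contained sketch (not from the original post) of date-range support: the job accepts an inclusive start and end date and replays each missed day in turn. The runJobFor entry point is hypothetical.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

public class BatchDateRange {

    // Expand an inclusive start..end range into the individual run dates,
    // so one invocation can replay every day the job missed.
    public static List<LocalDate> runDates(LocalDate start, LocalDate end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end date is before start date");
        }
        List<LocalDate> dates = new ArrayList<>();
        for (LocalDate d = start; !d.isAfter(end); d = d.plusDays(1)) {
            dates.add(d);
        }
        return dates;
    }

    public static void main(String[] args) {
        // e.g. the job failed for a couple of days and must be replayed
        List<LocalDate> dates = runDates(
                LocalDate.of(2015, 8, 10), LocalDate.of(2015, 8, 12));
        for (LocalDate d : dates) {
            System.out.println("Running batch for " + d);
            // runJobFor(d);  // hypothetical single-day entry point
        }
    }
}
```

This keeps the single-day logic as the unit of work, so a normal nightly run is simply a range of one day.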