Generally, we set the data into a joint table, this helps us to show data without processing. Tables containing collective data can be named as flat table and the managing procedures can be named as indexing. Here in this article we come to know how we can speed up the processing and displaying without adjusting the standards of database normalization.
Quick guide for the contents in the article
- Anomalies of a non-normalized database.
- Requirements for Data Indexer.
- Event Oriented Indexing.
- Magento Indexing.
Anomalies of a non-normalized database
We mainly come across three types of anomalies, which can be listed as under.
- Modification anomaly.
- Insertion anomaly.
- Deletion anomaly.
Modification anomaly is found when an attribute is present in multiple tables. In this scenario modification takes place at different location. If not modified properly, it will lead database to inconsistency.
We cannot insert a row when there is a missing piece of information. This results to loss of information and insertion anomaly occurs.
If the required data is deleted this leads to deletion anomaly.
Thus to avoid anomalies, one should organize the database according to the database normal forms.
What are the Requirements for Data Indexer?
Indexer speeds up the data display. The limitation of indexer is that it does not contain basic data but processed data. Thus it fails in the first level of normalization test. Hence it needs to match certain requirements.
Following are the two main requirements.
- Deletion of flat table from the system should not cause any anomaly.
- The system should work without flat tables.
The primary prerequisite can be met if just the indexing method deals with the table and we erase the table before each indexing.
What happens in the event that we have erased such information that would be required?
We can’t delete such information, the flat table is not part of the framework, and it just displays them. The system should work without its support.
Illustration: We form the flat table; however we delete records meanwhile, which is not required. For this situation, the information in the flat table is not forward any longer. Now when we reindex, there will be data that is not required any longer. In this scenario if we don’t delete the flat table, then we have to check if the given values are actually required or not. This would be a long and complex procedure. The best solution is to delete and rebuild the whole table. Later on this action will be supportive.
Speed will be obtained automatically as the table contains only the most required data. Now all you need is to manage and update regularly.
- Indexing procedure.
The diagram above sates the following steps:
- PHP asks for the records to be indexed, from tables and through various models.
- MySQL request for data.
- PHP creates the records with the help of cycles, usually with multi cycles.
- PHP sends back the records one by one.
- MySQL upgrades them.
Advantage: Simple and logical.
Disadvantage: It is very slow. It cannot meet the requirements of flat tables. Practically this is not possible.
- Procedure of indexing
- With the models, PHP creates the SQL SELECT which comes back in the flat table
- MySQL executes this SELECT and forms the table in the database
Advantages: It is very fast. Avoids the loss caused during communication between two systems. It also meets the criteria of the requirement.
Disadvantages: It is highly complicated. Thus it may happen, MySql may fail to create the processed data.
What are Event Oriented Indexing?
We can start the indexing process manually with cron. In this scenario reindexing is important as we are unaware of the records being up-to-date.
We cannot avoid reindexing completely as it only contains the required data which helps to manage the speed. On the other hand it is very costly. To minimize the cost one should only use this for certain rows only.
Database must be maintained in such a way that while inserting MySql it should automatically sate the modification if required.
One should be clear with two things:
- At what point index list be run?
- If there is a change in the source table
- Which are the records to be indexed?
- It must be run where the records of the data’s are located.
What is Indexing in Magento?
Creation of Flat table
It must be created in such a way that when inserted, MySQL must recognize whether there is a new or an existing record. Unique indexing is the best solution to it.
Provide a name to the table
Now register the indexer
You can add unique indexes in the installer
By doing this you can get rid of the records generating in multiple locations in the table.
Creation of indexing process.
All we need is to go through three functions.
- runReindex($id) – private
- reindexAll() – public
- reindexById($id) – public
Step 1:- Set the database adapter.
Step 2:- Join the tables by requesting the model
Step 3:- We have to remove all the columns of SELECT to adjust the index table. Then tables are joined for the data which will be required later on.
Example: ORDER ITEM join:
Column names and orders are defined with equivalent to flat table.
Generate the request providing the flat table:
Run and insert the table
Filter the records of SELECT:
Now, delete the content in index table and request the identifiers of all the records and call the runReindex($id) method.
Dispatch the Event.
Only during modification and deletion the data will be changed. Observation is needed for this event as Magento differentiates between index events. It depends on the programmer for which events should be monitored by the indexer.
When the indexer wants to observe an event not dispatched, it can be found in the Magento Core. Now the original class should be overwritten.
Implementing the Magento Indexer
Indexer class monitors and runs the indexing processes. It must be created in the model directory of your module. Extension to this class must be from the Mage_Index_Model_Indexer_Abstract class.
Observation of the event can be done through class array.
Earlier we had declared the model’s events. We can see in the code above the function of the ENTITY constant value found in the class. Now we can identify the model. Following is the methods need to be created:
What is Event recognition and management?
Following is the method
Detect the event and add on the required data there on for the indexing.
How to run Indexing?
The ideal methord is _proccessEvent method.
The target of any system is to achieve the maximum speed. Flash Tables enables fast display and this is highly recommendable. The problem of trafficking in the database system must be tackled. Also the communication must be minimized.