Options for depersonalizing sensitive production data for use in your test environments
Introduction to the data masking solutions
Companies that build software have evolved their systems to follow a set of best practices for development. These generally include the following.
There are a variety of data masking tools or data obfuscation tools on the market, and it's worth discussing the evolution of these solutions over time to compare each tool properly. Each generation has improved on the one before, and it is important to consider how added features and functionality impact the security and usability of a company's sensitive data.
The second and third best practices conflict with each other. By using production data in test and development environments, you expose sensitive data to software developers and testers. Even so, the ability to use production data in test environments is so compelling that the third best practice often takes a back seat to the second. As a result, many companies are now under pressure to prevent the exposure of sensitive data to their testers and developers. This pressure often comes from government privacy legislation, or from industry regulatory organizations such as the Payment Card Industry (PCI) Security Standards Council.
IBM offers two solutions to eliminate the inherent conflict within these best practices. The solutions extract data from production and depersonalize it while still maintaining its realism for high-quality testing. The IBM products refer to this process as data masking. National IDs look like national IDs, names look like names, addresses are valid addresses, but the data is no longer sensitive because all the personally identifiable information (PII) is now fictional. The two solutions that do this are the InfoSphere Optim Data Masking option for Test Data Management and the InfoSphere DataStage Pack for Data Masking.
The first products, collectively referred to as the InfoSphere Optim products in the rest of this article, are licensed differently, but the underlying technology is the same. So, if both solutions solve the same problem, the natural question is, 'What's the difference?' This article attempts to answer that question by looking first at the common core functions shared by the products, and then by examining the differences between them. The article is technical in nature and aims to guide customers who have identified the need to mask their data and must now decide which product to purchase and which features to use.
Test data masking essentials – The commonalities
Before examining the differences between the solutions, let's look at the fundamental functions that both provide. These functions are common between the solutions because they are all required to populate test environments with realistic, but fictional, data that can be used for testing. Many of these commonalities stem from the fact that both solutions use almost identical sets of data masking algorithms.
Masking algorithms
Some PII data follows a strict format and pattern. These fields include items such as credit card numbers, US Social Security Numbers, Canadian Social Insurance Numbers, or Brazil's Cadastro de Pessoas Físicas. Because the values follow a set of rules that determine their validity, they can be generated using an algorithm. Both solutions provide functions to mask credit card numbers from all of the major issuers, and National IDs from a variety of countries.
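As an illustration of generating a value from validity rules, the following minimal Python sketch produces a replacement credit card number that passes the Luhn check used by the major card issuers. It is not the IBM implementation. The function names and the choice to keep the first six digits (the issuer prefix) are assumptions made for this example.

```python
import random


def luhn_check_digit(partial: str) -> str:
    """Return the digit that makes `partial` + digit pass the Luhn check."""
    total = 0
    # Walk right to left. The rightmost digit of `partial` sits next to the
    # check digit, so it is the first digit to be doubled.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def is_luhn_valid(number: str) -> bool:
    """Check a full card number against its final check digit."""
    return luhn_check_digit(number[:-1]) == number[-1]


def mask_card_number(original: str, rng: random.Random) -> str:
    """Assumption: keep the 6-digit issuer prefix, randomize the remaining
    digits, and recompute the check digit so the result is still valid."""
    prefix = original[:6]
    body = "".join(rng.choice("0123456789") for _ in range(len(original) - 7))
    partial = prefix + body
    return partial + luhn_check_digit(partial)


masked_card = mask_card_number("4111111111111111", random.Random(0))
```

The masked value has the same length and issuer prefix as the input and still passes `is_luhn_valid`, so application code that validates card numbers keeps working on the test data.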
There is another set of PII fields that also follow a strict format but the values allowed are more flexible. One example is email addresses, where every address has a user name, a domain name and a '@' symbol. Both solutions provide functions to generate new and valid email addresses. In addition, an algorithm is available in both solutions to detect the format of data and replace the value with a new value of the same format. For example, it would detect the position of the space, numeric, and alphabetic characters in a Canadian postal code L6G 1C7, and replace the values with a generated L3R 9Z7, all without you specifying the format ahead of time.
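The format-detection idea can be sketched in a few lines of Python. This is a simplified illustration and not the algorithm the products ship. The function names are hypothetical, and a production function would likely apply further validity rules for each kind of field.

```python
import random
import string


def detect_shape(value: str) -> str:
    """Describe a value's format: 9 = digit, A = uppercase, a = lowercase.
    Any other character (space, hyphen, '@') is kept as a literal."""
    shape = []
    for ch in value:
        if ch.isdigit():
            shape.append("9")
        elif ch.isupper():
            shape.append("A")
        elif ch.islower():
            shape.append("a")
        else:
            shape.append(ch)
    return "".join(shape)


def mask_preserving_format(value: str, rng: random.Random) -> str:
    """Replace each character with a random one of the same class."""
    pools = {
        "9": string.digits,
        "A": string.ascii_uppercase,
        "a": string.ascii_lowercase,
    }
    out = []
    for marker in detect_shape(value):
        pool = pools.get(marker)
        # Literal characters have no pool and stay where they are.
        out.append(rng.choice(pool) if pool else marker)
    return "".join(out)


masked_postal_code = mask_preserving_format("L6G 1C7", random.Random(1))
```

For the postal code `L6G 1C7`, `detect_shape` returns `A9A 9A9`, and the masked value has that same shape without the format being declared ahead of time.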
Masking lookup functions
There are some PII fields that cannot be easily generated by an algorithm. These are things like first and last names, or postal addresses. For these, both solutions have lookup functions to look up values in pre-populated tables that contain things like names and addresses. The index of the value looked up to replace the original is either chosen randomly or by hashing an input value. Hashing an input value is performed in order to maintain consistency when masking.
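The two ways of choosing a lookup index can be sketched as follows. This is an illustration only: the replacement table, the secret key, and the function names are hypothetical, and HMAC-SHA256 is simply one reasonable choice of hash. The source does not say which hash the products use.

```python
import hashlib
import hmac
import random
from typing import Sequence

# Hypothetical pre-populated replacement table.
FIRST_NAMES = ["Alice", "Bruno", "Chen", "Dalia", "Emeka", "Farah", "Goran", "Hana"]


def random_lookup(table: Sequence[str], rng: random.Random) -> str:
    """Pick a replacement at random: realistic, but not repeatable."""
    return rng.choice(table)


def hash_lookup(value: str, table: Sequence[str], secret: bytes = b"demo-key") -> str:
    """Pick a replacement by hashing the input value, so the same input
    always maps to the same table row."""
    digest = hmac.new(secret, value.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:8], "big") % len(table)
    return table[index]


first_run = hash_lookup("Margaret", FIRST_NAMES)
second_run = hash_lookup("Margaret", FIRST_NAMES)
```

Because the index depends only on the input value and the key, `first_run` and `second_run` are always equal, no matter when the masking process is run.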
Some of these data masking functions are shown in Figure 1.
Figure 1. A sampling of the Optim data masking algorithms for the data masking solution
Consistency
Both solutions have masking algorithms that are designed for consistency. No matter when the masking process is run, the same values will result if the input values are the same. This is very useful in re-populating or adjusting your test data sets without breaking existing regression tests that may rely on certain values being present in the test environments.
Referential integrity
Some values that are masked are located in multiple tables, and the applications that are tested rely on the values between those tables being the same. Both solutions are designed to allow you to mask values and then propagate the results to the other tables.
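One way to picture propagation is to mask the key once in the parent table, remember the old-to-new mapping, and apply that mapping to every table that references the key. The Python sketch below assumes hypothetical `customers` and `orders` tables held in memory. It is not how either product implements propagation internally.

```python
import random


def mask_and_propagate(parent_rows, key, child_tables, rng):
    """Mask `key` in the parent rows, then apply the same old-to-new
    mapping to each (rows, column) pair in `child_tables`."""
    mapping = {}
    used = set()
    masked_parent = []
    for row in parent_rows:
        old = row[key]
        if old not in mapping:
            new = f"{rng.randrange(10**9):09d}"
            while new in used:  # keep masked keys unique
                new = f"{rng.randrange(10**9):09d}"
            used.add(new)
            mapping[old] = new
        masked_parent.append({**row, key: mapping[old]})
    # A child value with no parent raises KeyError here rather than
    # passing through unmasked.
    masked_children = [
        [{**row, column: mapping[row[column]]} for row in rows]
        for rows, column in child_tables
    ]
    return masked_parent, masked_children


def owner_names(customers, orders):
    """Join orders to customers on the national ID."""
    by_id = {c["national_id"]: c["name"] for c in customers}
    return [by_id[o["national_id"]] for o in orders]


customers = [
    {"national_id": "123456789", "name": "Customer A"},
    {"national_id": "987654321", "name": "Customer B"},
]
orders = [
    {"order_id": 1, "national_id": "123456789"},
    {"order_id": 2, "national_id": "987654321"},
    {"order_id": 3, "national_id": "123456789"},
]

masked_customers, (masked_orders,) = mask_and_propagate(
    customers, "national_id", [(orders, "national_id")], random.Random(7)
)
```

After masking, `owner_names(masked_customers, masked_orders)` returns the same list as it does for the original tables, so an application that joins the two tables behaves the same on the test data.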
Customization
Both solutions allow customers to build custom transformation functions to extend the ones that come with the products. The InfoSphere Optim Data Masking option for Test Data Management allows you to build new data privacy functions using column map exits in C/C++, or by creating scripts in the Lua language. DataStage can be extended using C/C++ or BASIC in Transformer Stages, or by creating custom operators in C/C++.
Data movement
Both of the solutions extract data, mask it, and then place it into a destination environment. Even so, how they move data is very different. The differences in the movement of data are the focus of the following section.
Solution differences
This article has discussed how, in terms of data masking functionality, both solutions offer a similar set of functions. Both can mask data so it's no longer sensitive but still realistic. Both allow you to do this while maintaining consistency between data masking processes and referential integrity between tables. Both move data from production, mask it, and then place it into a target destination. The following section will examine what makes each of them special.
The InfoSphere Optim products
With the InfoSphere Optim products, there is a version for System z and a version for distributed systems. Because the distributed version was modeled on the System z version, aside from how they deal with IMS and flat file data, they look fairly similar.
Both the System z and Distributed InfoSphere Optim products mask data that has been placed in files. The extraction process results in an extract file. The data is then masked using a convert process, and the masked extract file is sent to the destination environments. If a load request is built, load files are generated from the masked extract file and sent to the loader utility of the database in question. Figure 2 shows this process.
Figure 2. The masking process for the Optim Masking for TDM product
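The file-based flow can be approximated with a small Python sketch that stages data in CSV files: one file for the extract, one for the masked (converted) output, and a final step that stands in for the database loader. The file names, function names, and the placeholder masking function are all hypothetical. The real products use their own extract file format.

```python
import csv
import tempfile
from pathlib import Path


def extract(rows, extract_path):
    """Write the selected production rows to an extract file."""
    with open(extract_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def convert(extract_path, masked_path, mask_fns):
    """Read the extract file, mask the listed columns, write a new file."""
    with open(extract_path, newline="") as fin, open(masked_path, "w", newline="") as fout:
        reader = csv.DictReader(fin)
        writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            for column, fn in mask_fns.items():
                row[column] = fn(row[column])
            writer.writerow(row)


def load(masked_path):
    """Stand-in for a database load utility: read the masked file back."""
    with open(masked_path, newline="") as f:
        return list(csv.DictReader(f))


production = [{"id": "1", "name": "Margaret"}, {"id": "2", "name": "Rajesh"}]

with tempfile.TemporaryDirectory() as tmp:
    extract_file = Path(tmp) / "extract.csv"
    masked_file = Path(tmp) / "masked.csv"
    extract(production, extract_file)
    # Placeholder mask for the sketch only; it is not realistic masking.
    convert(extract_file, masked_file, {"name": lambda v: "X" * len(v)})
    test_rows = load(masked_file)
```

Each stage reads and writes a file, so the data touches disk on the masking server between steps.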
The InfoSphere Optim products work best when incorporated into a larger Test Data Management initiative rather than performing data masking alone. The solutions operate on what's known among InfoSphere Optim practitioners as the complete business object, which is a list of tables and relationships between those tables that define one end-to-end business process. Both Optim solutions are specifically designed to extract enough data for your test environments, and no more. They do this by traversing the relationships in the data and picking up related data elements. The Related topics section has an article that explains the complete business object more fully.
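To make the traversal idea concrete, here is a simplified Python sketch that starts from a few selected rows and follows parent-to-child relationships to collect only the related rows. The schema, table names, and function name are hypothetical. The sketch follows relationships in one direction only, which is an assumption of this example and not a description of how Optim traverses them.

```python
from collections import deque

# Hypothetical in-memory schema.
tables = {
    "customers": [{"id": 1}, {"id": 2}],
    "orders": [
        {"id": 10, "customer_id": 1},
        {"id": 11, "customer_id": 2},
        {"id": 12, "customer_id": 1},
    ],
    "order_items": [
        {"id": 100, "order_id": 10},
        {"id": 101, "order_id": 11},
        {"id": 102, "order_id": 12},
    ],
}

# (parent table, parent column, child table, child column)
relationships = [
    ("customers", "id", "orders", "customer_id"),
    ("orders", "id", "order_items", "order_id"),
]


def extract_business_object(tables, relationships, start_table, start_ids):
    """Collect the start rows plus every row reachable through the
    declared parent-to-child relationships."""
    selected = {name: [] for name in tables}
    selected[start_table] = [r for r in tables[start_table] if r["id"] in start_ids]
    queue = deque([start_table])
    while queue:
        parent = queue.popleft()
        for p_table, p_col, c_table, c_col in relationships:
            if p_table != parent:
                continue
            keys = {row[p_col] for row in selected[parent]}
            new_rows = [
                row for row in tables[c_table]
                if row[c_col] in keys and row not in selected[c_table]
            ]
            if new_rows:
                selected[c_table].extend(new_rows)
                queue.append(c_table)
    return selected


subset = extract_business_object(tables, relationships, "customers", {1})
```

Starting from customer 1, the subset contains that customer, the customer's two orders, and the items on those orders. Customer 2 and everything related to it stay in production.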
A recent development for the InfoSphere Optim products is that the design time tooling has been completely updated and reworked into an Eclipse-based component called the InfoSphere Optim Designer, which is shown in Figure 3. At the same time, a web-based management framework has been constructed. Having a web-based interface that is separate from the design interface allows you to more easily give test data users control over when and how their test data is refreshed.
Figure 3. Applying data masking policies using the Optim designer
In summary, in terms of data selection for data movement, the InfoSphere Optim products are similar to surgical tools dissecting test cases from production. This is not to say that the Optim solutions cannot handle large amounts of data (they can and have in the past), but they have extensive sub-setting capabilities and weren't built with the same bulk

A Comparison of the three IBM Data Masking Solutions

| Feature | InfoSphere Optim data masking option for test data management (Distributed and IBM for z/OS) | InfoSphere DataStage Pack for data masking |
| --- | --- | --- |
| Realistic masking algorithms | YES | YES |
| Consistent masking across systems and time periods | YES | YES |
| Maintain referential integrity | YES | YES |
| Can be customized | YES (C, C++, or Lua for Distributed. Assembler, VS COBOL II, PL/I, C, or Lua for z/OS). | YES (C/C++/BASIC) |
| Comes with externally callable data privacy functions | YES | NO |
| Works with native database load utilities | YES | YES |
| Pipelined processes (reduced masking server I/O) | NO | YES |
| Works on the concept of a complete business object (allows for efficient subset creation) | YES | NO |
| Built for symmetric multiprocessing (SMP), clustering, grid deployments, and massively parallel processing (MPP) | NO (but, there is some SMP support). | YES |
| Heterogeneous data source support (see the Related topics section for lists of platforms) | YES | YES |
Conclusion
This article has explored the primary functions that are required for a data masking solution. These include extensive data masking algorithms that not only mask the data, but do so realistically while maintaining referential integrity in the data, and consistency over time and between databases. It also discussed how these functions are present in both IBM solutions for data masking: The InfoSphere Optim Data Masking option for Test Data Management, and the InfoSphere DataStage Pack for Data Masking.
The article then discussed the differences between the solutions. The InfoSphere Optim products excel at surgically removing small amounts of data for masking. The InfoSphere DataStage Pack for Data Masking was built on top of DataStage, an enterprise-class ETL tool that excels in parallelism and scalability. Finally, the article discussed the use of the data masking API provided by the InfoSphere Optim products. Using this API allows you to provide your own data movement engine, which can give you additional flexibility for masking your data in non-production environments.
Data Masking Software Reviews For Free
Data Masking Suite: simple to install, flexible, and self-explanatory. Create test data and mask sensitive data. Data Masking Suite contains all you need for fast and reliable data masking.
Orpheus Data Masking Suite is exceptionally easy to use, and a free version is available to try. Most business departments use our software without requiring the assistance of their IT department. Data Masking Suite supports data sources such as Excel or CSV files and Access or SharePoint lists. You can also access any ODBC data source (MS SQL Server, Oracle, IBM DB2, MySQL, and others). You can overwrite existing data or insert the masked data into a separate new table or file. You can obfuscate names, address information, phone or credit card numbers, financial information, or any other type of sensitive data before it is sent by e-mail, to other departments, or outside the company's firewall. Some example scenarios:
a) Mask and sanitize test data for outsourcing or development partners: Many companies outsource parts of their software development or testing activities. With Data Masking Suite you can filter and mask the data you need to give your external development and testing partners.
b) Quickly build your own demo systems by masking data from customer projects: Demo systems and customer presentations cannot use the original data from previous customers. Data Masking Suite allows you to obfuscate the original data so that you can use it for your own demo systems.
c) Sanitize data that is restricted by law from leaving the country's borders: Some of our customers collect information from distributed subsidiaries. Personal information has to be masked so that it can be aggregated with existing data without specific personal details: 'I want to know how many people did something (again) and not who did it.'