
The Data Architecture & Management Designer Certification – My notes & thoughts

After putting it off for a while, I finally made time to prepare for and pass the Salesforce Data Architecture and Management Designer Certification (WOO-HOO!!). It came with a bonus too, as passing this certification also earned me my “Certified Application Architect” credential (WOO-HOO X 2!!), taking my certification count to 14 😊

Thanks to the awesome network on LinkedIn and the amazing #SalesforceOhana, I am getting tonnes of congratulatory messages, so a big thanks to this fantastic platform and the ever-growing Salesforce community. You guys rock!!

I wanted to thank you all by sharing what I learnt as part of my prep for this exam, the resources that I referenced and some of the key topics that I would like you all to focus on as part of your prep. I would also like to mention that there are some excellent blog posts written by the experts on this topic (the one from Gemma Emmett being an extremely good one that I also referenced during my revision, you truly are WONDER WOMAN!), and this is my attempt to give back to the community. So here we go:

  1. My first go-to resource for any certification is the Official Study Guide. I use the headers in the guide to structure my notes, as it helps ensure that I have covered all the nitty-gritty details and every area that could come up in the exam
  2. Useful videos that I recommend you watch (you can view them after you are done with all the reading and are well versed in all the topics)
  3. If someone asked me for a list of topics to focus on when preparing for this exam, this would be my laundry list:
    1. Writing efficient SOQL
    2. Indexes (all about them)
    3. Skinny Tables (all about them)
    4. Force.com Query Optimizer
    5. Data Skewing – Parent-Child Skew, Ownership Skew (there is a whole Cheat sheet that you should refer to)
    6. Parallel Recalculation
    7. Deferred Sharing Maintenance
    8. Granular Locking
    9. Selective/Non-Selective Queries
    10. On-Insert Events (and how to turn them on/off during loads to maintain Data Integrity)
    11. Extraction tools, including:
      1. Data Chunking
      2. PK Chunking
    12. Parallel Processing vs. Serial Processing
    13. Batch Apex
    14. Archival
      1. In-Place
      2. External
      3. Hybrid
    15. Backups and which one to use when
      1. Full
      2. Incremental
      3. Partial
    16. Backup Optimization – Vertical and Horizontal
  4. An excellent resource to kick off your prep would be the document on best practices for Deployments with Large Data Volumes. It covers everything from the multi-tenant architecture and the table structure behind the scenes to the tools & techniques that Salesforce uses to monitor & optimize performance. In my view, this is an excellent document to begin with.
  5. Topics that you should know in & out:
    1. Indexes – You should know all about these bad boys, for example:
      1. Standard Indexes
      2. Fields that support/do not support Indexing. Formulas are supported as of Winter ’13
      3. Indexing Formulas – Non-Deterministic Formulas – These cannot be indexed, but you should know what Non-Deterministic Formulas are and why they can’t be indexed
      4. Custom indexes to include NULL rows (supported as of Winter ’13)
    2. Skinny Tables – Another favorite. Familiarize yourself with:
      1. What are Skinny tables and what do they do?
      2. Fields that can/cannot be included in Skinny Tables
      3. How can you create/update/delete Skinny tables? (You need to contact Salesforce Support)
      4. Number of columns allowed in Skinny tables
      5. Copying Skinny Tables to Sandbox (what’s supported, what’s not supported)
    3. APIs – Types of APIs supported. You will find these covered under the Data Backup & Archival related articles. Learn all about them, which one to use where, pros/cons etc:
      1. SOAP API
      2. REST API
      3. BULK API
      4. METADATA API
  6. Read up on everything about Data Skewing. Get to know the issues that you can run into when bad design decisions result in Parent-Child Skew, Ownership Skew etc., and what you should do to avoid or handle such scenarios. The Official Trailmix has several articles that cover these topics, with examples and ways to address them
  7. Tools & Techniques to avoid Ownership Skew – What tools does Salesforce offer in order to prevent locks when you are loading data, or making large scale changes to data ownership, sharing & visibility:
    1. Parallel Sharing Recalculation
    2. Deferred Sharing Maintenance
    3. Granular Locking
  8. Further, you should be familiar with the following tools/techniques. I remember encountering several questions on these topics, especially Event Monitoring & Performance Dashboards, so make sure you give them a good read
    1. Force.com Query Optimizer
    2. Query Tool
    3. Event Monitoring
    4. Performance Dashboards
  9. Make sure you go through the EXTREME DATA LOADING SERIES. It is an extremely well written 6-part series, and covers pretty much every aspect of Large Data Volume deployments, whether it is data prep, sequencing, suspending events at the time of data loading, extracting etc. I cannot stress the importance of this series enough, so do not miss it at any cost!
  10. Large Data Volumes and Batch Apex – Another important topic to cover. Make sure you are aware of the recommended batch architecture when you are dealing with a large number of records. This is very well covered in the article – Force.com Batch Apex and Large Data Volumes
  11. Read up about SOQL and SOSL, what’s the difference between the two, and which one to use where.
  12. Additional areas that I encourage you to cover are as follows:
    1. Visualforce Best Practices
    2. Duplicate Management
    3. Data.com

I admit that I did not spend too much time on these topics (honestly, I totally skipped Data.com) and thankfully I did not see many questions coming from them (maybe I was lucky!). But I encourage you to read through them before you take the exam.
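To make the Data Skewing items from the list above a bit more concrete, here is a minimal Python sketch of how you could scan an extract for Parent-Child Skew and Ownership Skew candidates. This is only my illustration, not something from the official material: the `find_skewed_keys` helper and the sample records are hypothetical, and the 10,000 threshold is the commonly cited rule of thumb rather than a hard platform limit, so treat it as a tunable assumption.

```python
from collections import Counter

# Assumption: a commonly cited guideline is to keep any single parent record
# (or owner) below roughly 10,000 child records. Treat this as a tunable
# rule of thumb, not an official hard limit.
SKEW_THRESHOLD = 10_000


def find_skewed_keys(rows, key_field, threshold=SKEW_THRESHOLD):
    """Return {key: row_count} for keys that appear on more than `threshold` rows."""
    # Rows with a missing or empty key are ignored.
    counts = Counter(row[key_field] for row in rows if row.get(key_field))
    return {key: n for key, n in counts.items() if n > threshold}


# Hypothetical extract: 12,000 contacts under one account, 3 under another.
contacts = [{"Id": f"C{i}", "AccountId": "A1", "OwnerId": "U1"} for i in range(12_000)]
contacts += [{"Id": f"D{i}", "AccountId": "A2", "OwnerId": "U2"} for i in range(3)]

print(find_skewed_keys(contacts, "AccountId"))  # Parent-Child Skew candidates
print(find_skewed_keys(contacts, "OwnerId"))    # Ownership Skew candidates
```

The same counting idea works for any lookup or owner field in an export. Once a skewed parent or owner shows up, the usual remedies are the ones covered in the skew articles, such as spreading child records across more parents or owners.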

That’s what I had to cover from my prep. It took me a little over 2 weeks to go over everything that I documented above (in addition to my daily project activities), spending around 2-3 hours each day. Be prepared to do a lot of reading, and also be prepared to see content being repeated across topics. But be patient, because you will realize that it’s all worth it!

Hope you find this helpful. All the best for your prep. Keep learning & keep sharing!

Author: Anup Arora
