In the previous article we talked about TYPO3 template functions. I mentioned that substituteMarkerArrayCached is a function that developers should not use. In this article I am going to explain why.

As you may remember, there are four “substitute” functions for use with TYPO3 templates:

  • substituteMarker
    This function substitutes a single marker
  • substituteMarkerArray
    This function does the same as above, but for all markers in an array
  • substituteSubpart
    Substitutes a single subpart
  • substituteMarkerArrayCached
    The subject of today's article.

The first two functions substitute a marker and a marker array. The third substitutes a template subpart. The obvious missing function is one that substitutes a subpart array.
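To make the difference between markers and subparts concrete, here is a minimal standalone sketch. The sketch* functions are hypothetical stand-ins written for this illustration; they are not the TYPO3 implementations and ignore details such as HTML comments around subpart markers. The last one plays the role of the "missing" subpart array function.

```php
<?php
// Hypothetical stand-ins for illustration only; NOT the TYPO3 implementations.

// Replace each marker (e.g. "###TITLE###") with its value in a single pass.
function sketchSubstituteMarkerArray(string $template, array $markers): string
{
    return strtr($template, $markers);
}

// Replace everything from the first occurrence of $marker up to and
// including its second occurrence with $content.
function sketchSubstituteSubpart(string $template, string $marker, string $content): string
{
    $start = strpos($template, $marker);
    if ($start === false) {
        return $template; // subpart not present, nothing to do
    }
    $end = strpos($template, $marker, $start + strlen($marker));
    if ($end === false) {
        return $template; // no closing marker, leave the template alone
    }
    $end += strlen($marker);
    return substr($template, 0, $start) . $content . substr($template, $end);
}

// The "missing" function: substitute a whole array of subparts.
function sketchSubstituteSubpartArray(string $template, array $subparts): string
{
    foreach ($subparts as $marker => $content) {
        $template = sketchSubstituteSubpart($template, $marker, $content);
    }
    return $template;
}

$template = '<h1>###TITLE###</h1>###COMMENTS###<p>placeholder</p>###COMMENTS###';
$html = sketchSubstituteMarkerArray($template, ['###TITLE###' => 'Hello']);
$html = sketchSubstituteSubpartArray($html, ['###COMMENTS###' => '<p>No comments yet</p>']);
echo $html, "\n"; // <h1>Hello</h1><p>No comments yet</p>
```

A marker is a single placeholder, while a subpart is a whole region of the template enclosed between two identical markers.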

substituteMarkerArrayCached takes several arguments; the first three are the most interesting to us: the template, a marker array and a subpart array. This means that the function works almost like the one we just found missing. In other words, it can substitute a subpart array:

$content = $this->cObj->substituteMarkerArrayCached($template, array(), $subPartArray);

But there is a catch.

The catch is in the “Cached” word in the function name. This function caches its results and tries to reuse them next time. Caching happens in the cache_hash table.
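The general cache-by-hash pattern can be sketched as follows. This is an illustration under stated assumptions, not the TYPO3 code: the $cacheTable array is a hypothetical in-memory stand-in for the cache_hash table, cachedCall is an invented helper, and the choice to hash the template together with the marker values is an assumption made for this example.

```php
<?php
// Hypothetical in-memory stand-in for the cache_hash table: hash => stored value.
$cacheTable = [];

// Generic cache-by-hash wrapper (illustrative; not the TYPO3 implementation).
// ASSUMPTION: the key covers both the template and the marker values.
function cachedCall(array &$cacheTable, string $template, array $markers, callable $compute): string
{
    $key = md5($template . serialize($markers));
    if (!isset($cacheTable[$key])) {
        // Cache miss: compute the result and store a new record.
        $cacheTable[$key] = $compute($template, $markers);
    }
    return $cacheTable[$key];
}

// The uncached work that the wrapper tries to avoid repeating.
$render = function (string $template, array $markers): string {
    return strtr($template, $markers);
};

$template = '<p>###TEXT###</p>';
cachedCall($cacheTable, $template, ['###TEXT###' => 'First comment'], $render);
cachedCall($cacheTable, $template, ['###TEXT###' => 'Second comment'], $render);
cachedCall($cacheTable, $template, ['###TEXT###' => 'First comment'], $render); // reuses the first record

echo count($cacheTable), "\n"; // 2
```

Under that assumption, every distinct input adds one record, and only an exact repeat of an earlier call is served from the cache.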

It does not look bad at first glance. It may even speed up the web site.

Does it really?

Imagine a web site with articles. Let's say 10000 articles. Each article has comments. Suppose that the average number of comments is 10 per article. So, if comments use substituteMarkerArrayCached, this would lead to 100000 records in the cache_hash table. Now imagine that each comment requires two calls to substituteMarkerArrayCached...
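The arithmetic behind this scenario, as a small sketch. It assumes, as the scenario does, that every call creates its own record; the function name is made up for the example.

```php
<?php
// Estimate the number of cache records for the scenario in the text.
// ASSUMPTION: each call to the cached function produces one distinct record.
function estimateCacheRows(int $articles, int $commentsPerArticle, int $callsPerComment): int
{
    return $articles * $commentsPerArticle * $callsPerComment;
}

$oneCall  = estimateCacheRows(10000, 10, 1);
$twoCalls = estimateCacheRows(10000, 10, 2);

echo $oneCall, "\n";  // 100000
echo $twoCalls, "\n"; // 200000
```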

You can say: “Big deal! Disks are large and cheap now!” Yes, they are. But the problem is not disk space. It is MySQL that will suffer.

MySQL works very well with indexes. If it can locate a record using an index, the speed can be fantastic. Auto-increment fields make a good index, and this is the typical case for most tables in TYPO3. But cache_hash uses an MD5 value as its index. To MySQL, MD5 values look like random data, and they do not make good database indexes. So with this table you cannot count on indexes to help much.
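A quick way to see why MD5 values look random: hash a few consecutive keys and compare the order of the hashes with the order of the keys. The keys below are made up for the example and are not what TYPO3 actually hashes.

```php
<?php
// Hypothetical consecutive keys, as an auto-increment column would produce them.
$ids = [1, 2, 3, 4, 5];

// Hash each key the way a hash-based identifier would be built.
$hashes = [];
foreach ($ids as $id) {
    $hashes[] = md5((string)$id);
}

// Sort a copy of the hashes to see where each one would land in an ordered index.
$sortedHashes = $hashes;
sort($sortedHashes);

foreach ($ids as $i => $id) {
    $position = array_search($hashes[$i], $sortedHashes, true);
    echo "id $id -> {$hashes[$i]} (index position $position)\n";
}
```

Consecutive ids end up at scattered positions among the sorted hashes, so new records are inserted all over the index instead of being appended at its end, as happens with an auto-increment key.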

Another problem is the sheer number of records in this table. MySQL uses an index only if it believes that doing so will save time. If MySQL thinks that using the index will not bring much of a performance gain, it falls back to a full table scan. Imagine a full table scan on a 100000-row table. Wouldn't a non-cached version of substituteMarkerArrayCached be faster for just two entries in $subPartArray?

Unless you have really huge subpart and marker arrays, I recommend avoiding substituteMarkerArrayCached altogether. It can improve performance for really large substitutions, but not for smaller ones. PHP code often runs faster than a single database query (especially if the query goes over the network).

You can replace a call to substituteMarkerArrayCached with the following code:

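// Substitute all markers first, without touching the cache
$content = $this->cObj->substituteMarkerArray($template, $markers);
// Then substitute the subparts one by one
foreach ($subParts as $subPart => $subContent) {
    $content = $this->cObj->substituteSubpart($content, $subPart, $subContent);
}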

This is how the comments extension is going to work starting from the next version.

Summary

substituteMarkerArrayCached is bad because it stores its results in, and fetches them from, the cache_hash table. This causes lots of records and slower execution for simple substitutions. So do not use this function unless your substitutions contain a lot of data.


6 Comments

  1. on Wednesday, 16-07-08 10:33 Herms
    Sad but true. That is probably what currently causes problems with the extension "cal" from the web empowered church. It almost blew up my database with 4GB in the cache_hash table within a few days. IO waits at 90%, a slow to unavailable web site. And all that after statically including "cal" as minical on all pages in the sidebar.

    My database specialist has not had any kind words about primary key with varchar and md5 hashes. Full table scans on that huge data set are insane.
  2. on Wednesday, 16-07-08 17:35 Dmitry Dulepov
    I think newer cal versions work better with cache. I did not try myself but I heard that Mario did some changes there.
  3. on Thursday, 17-07-08 16:01 Dmitry Martynenko
    I think TYPO3 lacks some content caching functions for use in plugins. This is an example of a misleading function.

    Caching is needed to reduce computation and database queries as much as possible. But this function caches only after the data has been queried from the DB and all post-processing has been done. Is that reasonable?

    We do such caching in memcached. It is a good idea for your example with 10000 article records.
  4. on Tuesday, 22-07-08 00:50 Fabian
    Interesting Dmitry, thanks for explaining this.
  5. on Wednesday, 30-07-08 22:46 Ernesto Baschny
    Hi Dmitry,

    the information in this article is not really correct, if I remember well.

    substituteMarkerArrayCached won't cache the content of the substitution, but the "splitting" that needs to happen for filling in markers. So if you have:

    - one template
    - one "set" of markers + subparts
    - several different content to be filled in those markers/subparts

    you will get only ONE entry in the cache_hash.

    This will speed up processing of those several contents, because it won't need to do the preg_split every time in your template to fill in the content.

    The problem in "cal" is that the amount and order of markers / subparts that will be passed on to substituteMarkerArrayCached is "dynamic": it will fill those which are non-empty. So you end up getting lots of similar entries in cache_hash.

    I already discussed this with Mario at T3DD08 with some ideas on how to solve it (basically make sure that every call to substituteMarkerArrayCached passes the same set of markers/subparts, even if empty), but I am not sure if it was solved yet or if this was really the problem.

    Cheers,
    Ernesto
  6. on Friday, 29-08-08 13:07 Franz Koch
    I've done some changes in cal concerning the caching issues. I've mainly rebuilt the method 'substituteMarkerArrayCached' with a non-cached version of it.
    We can't make sure that there are always the same markers handed over to the function, because cal allows to add own markers in a very flexible way.

    One thing that TYPO3 is really lacking is the possibility to place the same plugin with the same piVar namespace on the same page, one being USER and the other being USER_INT, without triggering a no_cache.
