Things to look at

A list of features we should take a critical look at.

  • Database (MySQL, PDO, SQLite)
  • Template
  • Folder structure
  • MVC (Model-View-Controller)
  • Bundles (User, CMS, etc.)
  • Intl

Why would a project like Engine need to look at itself and its structure?
Well, to keep a project like this in line. Sometimes you need to stop, look up and see if you are where you want to be. In this case we are not exactly where we would like to be, but we are getting there slowly and steadily.

Now for the fun part: why the features mentioned above are on the list.

Database:
The database layer has some features that are no longer needed. The MySQL driver is old and nearly deprecated (see: http://www.php.net/manual/en/function.mysql-connect.php), so there is no longer any need for that driver within Engine. What about PDO? Should we actually support PDO? My money says no, mostly because I don’t really like PDO, but more importantly because I find PDO unnecessarily slow. SQLite is kind of a special case: it is not the most widely used database, but it is actually a pretty nice file-based database. Just for that, it deserves a spot on our supported list.

Template:
Well, the template engine is not very mature; quite frankly, this feature is just waiting for a rewrite. There are so many issues that it needs a total rethink of the design. Maybe we need to take a look at other template engines, and maybe we will get a nice idea for a new design.

Folder structure:
The folder structure has been discussed back and forth tonight, with the aim of a nice and easy implementation of some new features such as bundles (this subject is described further below).

MVC (Model-View-Controller):
The current implementation of MVC is working, but it has some very strong limitations, such as static routes and a limited ability to bind specific templates and controllers to a URL. The new implementation of MVC will probably look something like this:

<?php
Url::GET('some/url/gues/[numeric]/[chars]', function($id, $name){
    $this->template = 'template';
    $this->controller = 'yourController';
});
?>

In the above example, the method “GET” is simply the request method sent to the URL. This means the same URL can do different things for each request method. If the user sends an HTTP POST request to the above URL, you might not want to render the page, but only insert some data and then redirect. If it is a GET request, you would most likely render the page.
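To make the idea of per-method routes a bit more concrete, here is a minimal sketch of how such a router could match the placeholders. Everything in it is an assumption made for illustration: the class name `RouteSketch`, the `add()` and `dispatch()` methods, and the reading of `[numeric]` as digits and `[chars]` as letters. It is not how Engine implements routing.

```php
<?php
/* Hypothetical sketch: not part of Engine */
final class RouteSketch
{
    /* Assumed meaning of the placeholders, keyed by their preg_quote()'d form */
    private const PLACEHOLDERS = [
        '\[numeric\]' => '([0-9]+)',
        '\[chars\]'   => '([A-Za-z]+)',
    ];

    /** @var array<string, array<string, callable>> handlers indexed by request method, then regex */
    private array $routes = [];

    public function add(string $method, string $pattern, callable $handler): void
    {
        /* Escape the pattern, then swap the escaped placeholders for capture groups */
        $regex = strtr(preg_quote($pattern, '#'), self::PLACEHOLDERS);

        $this->routes[strtoupper($method)]['#^' . $regex . '$#'] = $handler;
    }

    public function dispatch(string $method, string $url)
    {
        foreach ($this->routes[strtoupper($method)] ?? [] as $regex => $handler) {
            if (preg_match($regex, $url, $matches)) {
                /* Drop the full match and pass the captured segments as arguments */
                return $handler(...array_slice($matches, 1));
            }
        }

        /* No route registered for this method and URL */
        return null;
    }
}
```

Registering the same pattern once with 'GET' and once with 'POST' gives two independent handlers for one URL, which is the behaviour described above.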

Bundles:
This will be a new concept introduced into Engine. Bundles will work as small packages, which can then be installed via the new command line tool (more on that later). The main goal of bundles is to separate all nonessential features from the Engine core, thereby making Engine easier to extend.

Intl or Internationalization (as Kalle likes to call it):
This is a concept that will be used to make multiple languages available in Engine, not as a translator but as a way to identify a specific language for a set of phrases, depending on what language the user has chosen.
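As a rough illustration of "a set of phrases per language", the following sketch looks up a named phrase in whichever language the user has chosen. The class name `PhraseBookSketch`, its constructor and the `find()` method are hypothetical, and the phrases are hard-coded only to keep the example self-contained; none of this is Engine's actual Intl API.

```php
<?php
/* Hypothetical sketch: phrase storage and names are assumptions, not Engine's Intl API */
final class PhraseBookSketch
{
    /** @var array<string, array<string, string>> phrases indexed by language, then phrase name */
    private array $phrases;

    private string $language;

    public function __construct(array $phrases, string $language)
    {
        $this->phrases  = $phrases;
        $this->language = $language;
    }

    /* Look up a phrase in the chosen language, falling back to the phrase name itself */
    public function find(string $name): string
    {
        return $this->phrases[$this->language][$name] ?? $name;
    }
}
```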

Posted in Engine, Products

Translatable Exceptions

So after some more drafting on the uploading system, I sort of ran into a minor thing regarding exceptions and my plans to also overhaul the internationalization component.

So, translatable exceptions. Currently, exceptions use a static error message unless ‘\Tuxxedo\Exception\Translated’ is used, which is an exception version of the ‘\Tuxxedo\Intl::format()’ method. That works quite well for specific cases, but for the template compiler exception (‘\Tuxxedo\Exception\Compiler’) or the upcoming upload exception (‘\Tuxxedo\Exception\Upload’) it won’t work so well, as they use a state-based error type. To overcome this, I thought of a more unified way to achieve internationalization without changing the API too much. Let’s have a look at the simplistic design and the code changes it will cause.

Translatable

Exceptions that are state based (exceptions that have multiple error messages containing meta data) currently use a ‘baked-in’ error message, meaning it is not translated for internationalization. Otherwise, Engine only uses ‘baked-in’ error messages at boot time, meaning when Engine hits an unrecoverable state in the start-up process, to account for cases where the internationalization component would not be loaded.

Since the compiler is not tied to running in one specific environment, and the upload one is going to be a runtime component, this needs to change, while still keeping the ‘baked-in’ error messages for cases where internationalization is not used, like the datamanagers currently do.

So how can we achieve this? The most lightweight option is to use a plug-in interface, which will be put into the ‘\Tuxxedo\Design’ namespace, since all classes located in the ‘\Tuxxedo\Exception’ namespace are throwable. This interface will require exceptions to implement a method called ‘getTranslated()’ or similar, which returns the exception translated into the currently loaded system language. A parameter could be added to get the error in a custom language if it is not too much hassle for the system, although I have my doubts about this last one, as it creates some overhead that I don’t think Engine needs “just” to be able to do that.

Example

Let’s take a quick example. The system’s loaded language is Danish, and since the compiler exception uses this translatable interface, we need to call the translatable version. In the current draft that is a new exception instance, similar to a ‘previous’ exception in PHP terms. How this is going to be implemented is discussed below; now on with the example:

 ...

use Tuxxedo\Exception;
use Tuxxedo\Template\Compiler;

try
{
        $c = new Compiler;

        /* Missing </if> at condition #1 - Parse error */
        $c->setSource('<if expression="true">');
        $c->compile();
}
catch(Exception\Compiler $e)
{
        /* Fejl: Syntaks fejl, manglende </if> i betingelse #1 */
        echo 'Fejl: ' . $e->getTranslation()->getMessage();
}

...

Exception implementation

Notably, what if the language in the example is ‘english’? To follow the egoistical logic of Engine, it is up to the programmer not to use the method as a kind of generalization, but only when it is needed. However, if Engine does not detect a language change, meaning the system language is ‘english’, then Engine will simply return ‘$this’; otherwise a virtual populated instance of the same exception type is created.

Wait a minute, a virtual populated instance? So the statically built-in message is simply translated and that’s it? Yes, that’s pretty much the current concept. I do realize that it does not seem like a clever solution, and more like a hack as of the current blog draft, but bear with it, as many things designed while writing blog entries have later changed in ways that are incompatible with what was posted here on the blog.
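A minimal sketch of that concept could look like the following. The names `TranslatableSketch` and `CompilerExceptionSketch`, the state constant, the constructor signature and the hard-coded phrase table are all assumptions made for illustration; in Engine the phrases would come from the internationalization component, and the real design may well differ.

```php
<?php
/* Hypothetical sketch: names and the phrase table are assumptions, not Engine code */
interface TranslatableSketch
{
    public function getTranslation(): Exception;
}

class CompilerExceptionSketch extends Exception implements TranslatableSketch
{
    public const STATE_MISSING_ENDIF = 1;

    /* 'Baked-in' messages, usable even when no translation is loaded */
    private const MESSAGES = [
        self::STATE_MISSING_ENDIF => 'Missing </if> at condition #%d',
    ];

    /* Stand-in for phrases that would come from the internationalization component */
    private const PHRASES = [
        'danish' => [
            self::STATE_MISSING_ENDIF => 'Syntaks fejl, manglende </if> i betingelse #%d',
        ],
    ];

    private int $state;
    private int $condition;
    private string $language;

    public function __construct(int $state, int $condition, string $language = 'english', ?string $message = null)
    {
        $this->state     = $state;
        $this->condition = $condition;
        $this->language  = $language;

        parent::__construct($message ?? sprintf(self::MESSAGES[$state], $condition));
    }

    public function getTranslation(): Exception
    {
        $phrase = self::PHRASES[$this->language][$this->state] ?? null;

        /* No language change (or no phrase available): hand back the exception itself */
        if ($phrase === null) {
            return $this;
        }

        /* Otherwise populate a new instance of the same type with the translated message */
        return new static($this->state, $this->condition, $this->language, sprintf($phrase, $this->condition));
    }
}
```

The state and its meta data (here the condition number) are what make the translation possible: the translated instance is rebuilt from them rather than from the already formatted English message.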

 

So for now, stay tuned for more about this subject, along with more about the uploading factory that’s currently in the works. Ciao!

 

Posted in General

Uploading, what?

Howdy

So it’s been a while with no new technical post here at Tuxxedo, but fear not, I’ve dedicated this whole post to some technical theories and material that are currently being worked on.

As the title says, this post is dedicated to uploading files. It’s a huge missing feature within the Engine API, and a very common task for many applications. This post will go over some of the concerns about implementing an API that is simple on the outside and complex on the inside, while being robust and designed with security in mind.

A little background

Some years ago, I think it was a cold and dark January night in 2006, I wrote my first basic script that could upload files from my personal computer to my website, in an attempt to venture into an area of the web world that I had not yet worked with. At the same time, on a website where I was staff, there was huge demand for image uploaders, so the next night I turned my script into a simple program that could upload images, named KalleLoad (how original!).

Development of this program progressed rapidly and I added many new features to it. The version I worked on was never really finished when I stopped supporting the program, and it is the version the same old domain (which still hosts the original files from 2006, by the way) is still running: 1.5.0 Gamma 2-dev. In the 1.5 series I re-thought many of the concepts regarding the upload mechanics, but I was not able to implement a clean and flexible API due to my idea at the time that KalleLoad had to support PHP4.

So what does this mean? It means that it is long overdue, and time to realize some of these concepts and ideas within the Engine namespace.

Features

There is a range of features that this API will support. Some of these were supported previously, but were poorly implemented or not usable without dirty hacks, which invalidated their status as an API.

Multi upload: Support was halfway implemented in KL 1.5, but never completed. By redesigning the class that handles the controls, we can implement a smart ‘queue’-like system with a clean API using the \Tuxxedo\Design\InfoAccess pattern.

...

use Tuxxedo\Upload;

/* URL to a file somewhere, in this case the Google logo */
$url = 'https://www.google.com/images/srpr/logo3w.png';

/* Create the queue, with no resource handlers */
$queue                    = new Upload;

/* Create the batch of files to upload (named value/value => type) */
$queue['fileinput']       = Upload::TYPE_INPUT;
$queue[$url]              = Upload::TYPE_URL;

/* Process the upload */
foreach($queue as $result)
{
     /* $result is now an object of \Tuxxedo\Upload\Result */
     /* Result objects contain information about the transaction, */
     /* including detailed errors and other information such as size */
     printf('Processing file \'%s\'...', $result['filename']);
}

...

Before we go too deep into that example, let’s explain the basic idea of what is going on here:

  • $queue is initialized to an instance of \Tuxxedo\Upload, with no resource handlers
  • $queue is populated with 2 items to upload, from a form (<input type="file" />) and a URL
  • Iterating $queue starts processing the queue batch, meaning they will be uploaded
  • $result from each iteration contains detailed information about a particular file, including errors and such

The first one is sort of a no-brainer. Resource handlers will be explained later in this post, once the basics are understood.

Types are a way of telling the underlying code layer how to actually process the input and what to look for. There is a relatively big difference between processing a file from a form and fetching a remote file using streams to gather the relevant information needed before a file may be accepted for transfer. In this case, inputs and URLs have very different ways in the PHP world of getting the same information, like MIME type, size etc.

The internal implementation of types, at least the URL handler, uses an abstract streaming layer, so that extensions to Engine can use it to implement support for otherwise unsupported protocols, for example attachments from a mail using IMAP. However, the inner workings of this stream API are not going to be covered further in this post.

Processing is perhaps the part that is least obvious, since you would probably expect a method or similar to tell the loop (or even the programmer) that we’re processing the queue now. But nobody wants to write code we do not really need, or can gently skip in an “obvious” way once we understand the underlying idea. While the foreach loop will cause the queue’s unprocessed items to be processed, the ->process() method will be available for single item processing, or for what I would call the “ol’ fashioned way” of doing it.

The last part of this feature is the result that is returned from processing, either from the loop or from processing the queue manually. This object contains not only useful information gathered about the file, but also controls that can be used if the generic resource handler is used, or if any of the resource handlers permit manual override of actions on a file. An example of such an action could be moving the file from the temporary upload cache to a custom folder that might not be what the resource handler decided, for example if the resource handler was written for a specific application and the values were sort of ‘baked in’.
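To illustrate how a foreach loop alone can drive the processing, here is a minimal sketch built on PHP's ArrayAccess and IteratorAggregate interfaces. The class name `UploadQueueSketch`, the constant values and the array-shaped result are assumptions for illustration, and no real transfer takes place; the actual \Tuxxedo\Upload design may look quite different.

```php
<?php
/* Hypothetical sketch: no real uploading happens, all names are assumptions */
final class UploadQueueSketch implements ArrayAccess, IteratorAggregate
{
    public const TYPE_INPUT = 1;
    public const TYPE_URL   = 2;

    /** @var array<string, int> items waiting to be processed, source => type */
    private array $pending = [];

    /** @var array<string, array> results of the items that were already processed */
    private array $results = [];

    public function offsetExists($offset): bool
    {
        return isset($this->pending[$offset]) || isset($this->results[$offset]);
    }

    #[\ReturnTypeWillChange]
    public function offsetGet($offset)
    {
        return $this->pending[$offset] ?? null;
    }

    public function offsetSet($offset, $value): void
    {
        $this->pending[(string) $offset] = $value;
    }

    public function offsetUnset($offset): void
    {
        unset($this->pending[$offset]);
    }

    /* Process a single pending item; a real implementation would transfer the file here */
    public function process(string $source): array
    {
        $result = ['filename' => basename($source), 'type' => $this->pending[$source]];

        unset($this->pending[$source]);

        return $this->results[$source] = $result;
    }

    /* Iterating the queue processes every item that has not been processed yet */
    public function getIterator(): Iterator
    {
        foreach (array_keys($this->pending) as $source) {
            yield $this->process($source);
        }
    }
}
```

Because `getIterator()` is a generator, each item is only processed when the loop reaches it, and a second loop over the same queue finds nothing left to process.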

Naming: KalleLoad has used two different naming algorithms for files. 1.0-1.4 used an md5 hashed string, and 1.5 used a random alphanumeric character sequence for both the file name and the upload folder (if generated). They can be expressed as:

[Figure: EBNF diagram of the two naming algorithms]

Simple, right? I want to implement these two algorithms plus an additional new one. The ‘original’ is more for the retro feel of KalleLoad, though it will at least gain some more randomness in terms of how the hash is calculated. The ‘classic’ one is really simple and straightforward, and suits most tasks of creating a unique filename.

Some people like to name their pictures, whether it is an id of some sort or just a ‘gag’, and we can use this to create a unique filename as well. This is why I want to introduce a third naming algorithm, called ‘logical’. It will use ‘logical’ portions of the original filename to name the new file. While it sounds simple to skip certain non-ASCII characters for safekeeping and keep the name alphanumeric with casing support, there is still quite a guessing game in figuring out a suitable name. Let’s break it down a little:

  1. Strip out any non-ASCII characters, like Unicode, since we want a ‘simple’ file name. This creates an issue for writing systems such as Arabic, meaning files named in such scripts will not be able to pass this first step of the test.
  2. Separate ‘words’ or ‘phrases’, using a space (“ ”), hyphen (“-”) and underscore (“_”) as separators.
  3. Depending on how the internal algorithm will be put together, it may include one or multiple ‘words’; if multiple, concatenate them using either a hyphen or underscore as separator.
  4. Prefix (most likely) or suffix the concatenated string for uniqueness/randomness.

So say we got a file named ‘AFUP05 209405.jpg’; the algorithm could break it down to something like ‘1685_afup05-209405.jpg’. The ‘1685’ is the prefix, there for safekeeping so that no naming collisions occur in this case. I personally like this method because it is a bit more expressive than some ‘completely’ random string.
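The four steps could be sketched roughly as follows. The function name `logicalNameSketch` is hypothetical, and several choices are assumptions made to match the ‘AFUP05 209405.jpg’ example: words are lowercased, joined with a hyphen, and the prefix is attached with an underscore. The prefix is passed in as an argument purely so the result is predictable.

```php
<?php
/* Hypothetical sketch of the 'logical' naming steps; separator and casing choices are assumptions */
function logicalNameSketch(string $original, string $prefix): string
{
    $extension = strtolower(pathinfo($original, PATHINFO_EXTENSION));
    $basename  = pathinfo($original, PATHINFO_FILENAME);

    /* Step 1: strip every byte outside the printable ASCII range */
    $ascii = preg_replace('/[^\x20-\x7E]/', '', $basename);

    /* Step 2: split into 'words' on space, hyphen and underscore, keeping alphanumerics only */
    $words = [];

    foreach (preg_split('/[ _-]+/', $ascii, -1, PREG_SPLIT_NO_EMPTY) as $word) {
        $word = preg_replace('/[^A-Za-z0-9]/', '', $word);

        if ($word !== '') {
            $words[] = strtolower($word);
        }
    }

    /* Names without any usable ASCII portion fail, as noted for step 1 */
    if (!$words) {
        throw new InvalidArgumentException('No usable ASCII portion in file name');
    }

    /* Step 3: concatenate the words; step 4: prefix the result for uniqueness */
    $name = $prefix . '_' . implode('-', $words);

    return $extension !== '' ? $name . '.' . $extension : $name;
}
```

In practice the prefix would be generated rather than supplied, for example with `random_int(1000, 9999)`, and checked against existing files.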

Resource handlers: these work as a kind of add-on that can be written for a specific application, letting the add-on do much of the business logic. This means that each time a queue is processed, the files don’t need to be moved manually (as in the code example above).

So what can resource handlers do? Anything. A resource handler has almost complete control of each stage in processing a transaction queue. A resource handler is programmed much like datamanager adapters are, with a way of instructing Engine to only invoke the resource handler at certain defined breakpoints. For example, if we imagine a ‘KalleLoad’ resource handler being implemented, then in order to fulfill the same requirements as the current website has, it will need to be invoked at the following breakpoints:

  1. On file type (the MIME type) to determine whether it is considered valid
  2. On naming, although this could be optional due to the naming algorithms implemented
  3. On storage, leaving the resource handler fully in charge of moving the file from its temporary location into the correct location on the file system

With some clever hooking, this can be achieved quite easily without any overhead. I’ve not yet fully decided if I want to use the Event handling approach which is currently available in the trunk version of Engine. I fear that the way resource handlers will work, with all this control, is more than what the Event handler system was designed for, but I’m sure I can come up with a logical way to achieve this with some OOP magic.

A resource handler must be supplied per queue instance:
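One possible shape for the breakpoint idea, sketched under heavy assumptions: a handler declares as a bitmask which breakpoints it wants, and the queue only calls it at those. The interface `ResourceHandlerSketch`, its constants and methods, the example handler and the `runBreakpointsSketch()` helper are all hypothetical and only meant to show the mechanism, not the final hook design.

```php
<?php
/* Hypothetical sketch: interface, constants and method names are assumptions */
interface ResourceHandlerSketch
{
    public const ON_TYPE    = 1;
    public const ON_NAMING  = 2;
    public const ON_STORAGE = 4;

    /* Bitmask of the breakpoints this handler wants to be invoked at */
    public function breakpoints(): int;

    /* Called at a breakpoint; returns the (possibly changed) transaction data */
    public function handle(int $breakpoint, array $file): array;
}

final class ImageOnlyHandlerSketch implements ResourceHandlerSketch
{
    public function breakpoints(): int
    {
        /* Naming is left to the built-in naming algorithms */
        return self::ON_TYPE | self::ON_STORAGE;
    }

    public function handle(int $breakpoint, array $file): array
    {
        if ($breakpoint === self::ON_TYPE) {
            $file['valid'] = strpos($file['mime'], 'image/') === 0;
        } elseif ($breakpoint === self::ON_STORAGE) {
            $file['path'] = '/uploads/' . $file['name'];
        }

        return $file;
    }
}

/* Walk the breakpoints in order, invoking the handler only where it asked to be */
function runBreakpointsSketch(ResourceHandlerSketch $handler, array $file): array
{
    $order = [
        ResourceHandlerSketch::ON_TYPE,
        ResourceHandlerSketch::ON_NAMING,
        ResourceHandlerSketch::ON_STORAGE,
    ];

    foreach ($order as $breakpoint) {
        if ($handler->breakpoints() & $breakpoint) {
            $file = $handler->handle($breakpoint, $file);
        }
    }

    return $file;
}
```

A fuller version would presumably stop the transaction once a file is marked invalid; that is left out here to keep the sketch short.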

 ...

use Tuxxedo\Upload;

$queue = new Upload;

$queue->handler('\KalleLoad\API\UploadRsrcHandler');
$queue->handler('\Application\UploadRsrcHandler');

...

I’m still not sure if it is wise to allow more than one resource handler per queue instance at a time, since it creates all sorts of complexity, such as needing to signal state changes to each of the resource handlers so that one doesn’t try to ‘hijack’ a file from another, deciding which resource handler has priority, and so on. In my mind it generally creates more issues than it solves, which is why I think that if the above example is executed, the current resource handler would be ‘\Application\UploadRsrcHandler’.

Error handling

Now that we’ve covered some of the major feature concepts behind the API, it is time to quickly delve into how we deal with errors. The API supplies just one new exception type, named ‘\Tuxxedo\Exception\Upload’. This type works the same way as that of the template compiler: by states. Each state may supply additional meta information that can be used to make error messages more expressive.

Exceptions are used in places where an ‘unrecoverable’ error occurs, meaning the API cannot continue to function. It could be a programmer mistake, a network outage or something else that causes it to basically blow up. Handling errors per file transaction is another, more silent thing, if you like. Each object returned by processing, whether from the loop or from calling the actual method, contains detailed information about a transaction and what state it is in. This means that unless a resource handler picks up such errors, it is up to the programmer to pick them up and deal with them:

 ...

$failed = Array();

foreach($queue as $result)
{
      /* Check for errors */
      if($result->state == Upload::STATE_ERROR)
      {
            $failed[] = $result;

            continue;
      }
}

if($failed)
{
      echo 'The following files failed to upload:';
      echo '<ul>';

      foreach($failed as $result)
      {
            echo '<li>' . htmlspecialchars($result['filename']) . '</li>';
      }

      echo '</ul>';
}

...

This is still subject to change, but the base idea is to leave error handling to the developer if the resource handler doesn’t take over. A more automated approach can always be looked at later, but for now I think we are good with this approach.

Round up

I think I covered many items here for now. I’ve still got some more things I want to add regarding security and how all that blends into the whole picture, along with more information about how I intend to design the stream API, including prototypes of handler interfaces and how the resource handler hooks might work internally, with more code than what was in this small blog post.

I hope you enjoyed reading this, and don’t forget there is more to come once I have a proper grasp of the idea and have tried out some different techniques. Stay tuned!

Posted in General

Backend updates

From the release of this post and until late September, the server will undergo some major changes.

Be aware that downtime may occur at any point during this timeslot.

It will be done in 2 stages:
Stage 1
The first stage will probably cause the most downtime of the two stages. All services on the server, and all connectivity to it, will periodically be unavailable while the changes occur.
The first stage will focus on changes in the OS.

I expect the server to be back in “normal” conditions around mid August.
From then until stage 2 we will check for any issues there might be.

Update: The remaining stage 1 work should only cause minor downtime.
We have enabled KalleLoad uploads again, for now.

Stage 2
In this stage there will be some updates to the web server and the PHP installation directly.
The major purpose of this update is to implement IPv6 connectivity again. We have the IPs, now we just need the web server to support it.

The planned update is scheduled to be implemented in the first half of September. This will also allow for testing of the update and integration of the changes from stage 1.

We will do what we can to minimize the downtime.
KalleLoad will have uploads disabled during the stages.

 

/Peter Emil Henriksen

Posted in Server related

1.1: Datamanagers

This is the first post on 1.1 changes. As much as I’ve talked about them in the past, datamanagers are becoming more and more an integral part of the core of Engine. In 1.1 I’ve spent quite some time optimizing the API and making sure that each manager validates input correctly and syncs it correctly.

New datamanagers

Originally 1.0 shipped with just a few datamanagers:

  • session
  • style
  • user
  • usergroup

In 1.1, more datamanagers have been added to accompany the newly added tools in the DevTools application:

  • datastore
  • option
  • optioncategory
  • permission
  • template

While these new ones don’t cover all the remaining abilities of the core framework, they help make existing ones more feature-complete, like the ‘template’ datamanager that’s connected to ‘style’, for example.

As mentioned, not all things can be controlled by datamanagers yet. In 1.1, 2/3 of all the things that couldn’t be managed by datamanagers now can be. What is still missing in 1.1, intentionally, is:

  • language
  • phrase
  • phrasegroup

These have been left out for 1.2, due to the big focus 1.2 will have on internationalization.

Hooks API

Previously in 1.0, datamanagers came with one somewhat baked-in hook that datamanagers could choose to implement, called APICache (API being the prefix, Cache the name). While only the location of this particular hook has changed, new ones have found their way into the core, and the API that executes hooks has been made public, so applications that choose to override it can do so.

The new ones that have been added in 1.1 are:

  • Virtual: Calls a method named ‘virtual’, with the field, and the data for that field
  • VirtualField: Calls a method named ‘virtual<Field>’, with the data for that field
  • Resetable: This method is baked in, as it tells the datamanager base that a datamanager may reset itself to its original state.

While these work essentially like the ‘Cache’ hook, they differ in terms of their function.
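As a rough sketch of the difference between the two virtual hooks: ‘VirtualField’ receives only the data, while ‘Virtual’ receives both the field name and the data. How the datamanager base detects that a hook is implemented is not described here, so the `method_exists()` checks, the order in which the hooks are tried, and the class names `DataManagerSketch` and `UserManagerSketch` are all assumptions for illustration.

```php
<?php
/* Hypothetical sketch: how the base detects hooks is an assumption, not Engine's code */
abstract class DataManagerSketch
{
    protected array $data = [];

    /* A field with a matching hook method is routed through it instead of being stored as-is */
    public function set(string $field, $value): void
    {
        $specific = 'virtual' . ucfirst($field);

        if (method_exists($this, $specific)) {
            /* 'VirtualField' style: virtual<Field>($value) */
            $this->{$specific}($value);
        } elseif (method_exists($this, 'virtual')) {
            /* 'Virtual' style: virtual($field, $value) */
            $this->virtual($field, $value);
        } else {
            $this->data[$field] = $value;
        }
    }

    public function get(string $field)
    {
        return $this->data[$field] ?? null;
    }
}

final class UserManagerSketch extends DataManagerSketch
{
    /* Virtual field 'fullname': never stored itself, but split into two real fields */
    public function virtualFullname(string $value): void
    {
        [$first, $last] = explode(' ', $value, 2) + [1 => ''];

        $this->data['firstname'] = $first;
        $this->data['lastname']  = $last;
    }
}
```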

Validation

Validation of fields has been rewritten from scratch in 1.1, and all methods have been well tested to make sure that consistent data is passed to the datamanager.

While this is just a first step in the direction I personally want to go, it suits the 1.1 series much better, as some of these rewrites have decoupled the logic from the previous ‘Filter’ component (now named ‘Input’) and made the datamanager base more self-contained. For example, the ‘callback’ method no longer exists in the ‘Input’ class, nor does the ‘STRING_EMPTY’ constant.

Another feature that I believe is really powerful is the new validation constant ‘VALIDATE_IDENTIFIER’. It makes the datamanager base automatically validate the identifier to ensure that unique keys in the database remain unique, which means less validation code has to be written for each datamanager.

Contexts

While this probably should have been in the ‘Hooks’ section, I think contexts are worth mentioning. Contexts are a way of telling the datamanager where we are now and what we are executing, which makes hooks less complicated to design and implement by making the datamanager deal with the logic for each context internally.

Currently there are 4 contexts:

  • CONTEXT_NONE – Indicates that we’re not in a special method, could be validation of callbacks for example
  • CONTEXT_SAVE – Called in save()
  • CONTEXT_DELETE – Called in delete()
  • CONTEXT_VOID – Indicates that this datamanager instance now is void and cannot do anything (after a successful delete for example)

You may be wondering why there is no ‘CONTEXT_RESET’ to go with the new Resetable hook mentioned above. This is because the hook requires each datamanager that implements it to implement this method itself, meaning that it is only called in one context, hence no context for it in the base.
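To show how the four contexts could fit together, here is a minimal sketch in which the manager switches context around save() and delete(), and a hook simply asks which context it is running in. The class name `ContextManagerSketch`, the constant values, the `$seen` log and the exception thrown for a void instance are assumptions for illustration only.

```php
<?php
/* Hypothetical sketch: constant values and method bodies are assumptions */
class ContextManagerSketch
{
    public const CONTEXT_NONE   = 1;
    public const CONTEXT_SAVE   = 2;
    public const CONTEXT_DELETE = 3;
    public const CONTEXT_VOID   = 4;

    private int $context = self::CONTEXT_NONE;

    /** @var int[] contexts observed by the hook, recorded for illustration */
    public array $seen = [];

    public function save(): bool
    {
        return $this->run(self::CONTEXT_SAVE, self::CONTEXT_NONE);
    }

    public function delete(): bool
    {
        /* After a successful delete the instance becomes void */
        return $this->run(self::CONTEXT_DELETE, self::CONTEXT_VOID);
    }

    public function getContext(): int
    {
        return $this->context;
    }

    private function run(int $during, int $after): bool
    {
        /* A void instance refuses to do any further work */
        if ($this->context === self::CONTEXT_VOID) {
            throw new LogicException('This datamanager instance is void');
        }

        $this->context = $during;
        $this->hook();
        $this->context = $after;

        return true;
    }

    /* A hook only needs to ask which context it is running in */
    protected function hook(): void
    {
        $this->seen[] = $this->context;
    }
}
```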

 

Hope this gave a bit of insight into datamanagers and what they bring in 1.1,
-K

Posted in General