Considering pages when caching a JSON response
Social networks work by presenting a feed and allowing the user to interact with the content in that feed in (hopefully) novel and interesting ways that the user derives usefulness (joy?) from. The presentation of the feed varies greatly from social network to social network but the underlying model is often very similar to:
The Feed is the total set of information; if you are using a RESTful API this will often take the form of an endpoint URL such as https://greatsocialnetwork.com/api/v2/news
The Page is a subset of the information contained within the feed and will often be the actual response from the endpoint.
The Post is the individual piece of content that the user will actually see and is contained within the Page.
Coupled with this common data model is the concept of caching. Caching is where we locally store data to allow for quicker retrieval; we can cache the result of calculations or a duplicate of data stored elsewhere, such as on a server. The trade-off associated with caching is that we gain quicker retrieval time at the expense of accuracy (or up-to-dateness). Depending on your data set this will impact how long you can store and present potentially out-of-date data to the user. In the example app that we are going to build we don't care much for the accuracy of individual posts but we do care that we present a full set of pages and that those pages are in the correct order without gaps.
Stacking the questions
I'm going to use Stack Overflow as the example of a social network, as its API follows the model described above (and you don't have to register to start using it 😃). Let's look at the questions endpoint, the URL for this is https://api.stackexchange.com/2.2/questions?order=desc&sort=creation&site=stackoverflow - the response from this endpoint contains an array of questions (described as items) such as:
{
"items":[
{
"tags":[
"ios"
],
"owner":{
"reputation":55,
"user_id":5024892,
"user_type":"registered",
"accept_rate":42,
"profile_image":"https://www.gravatar.com/avatar/1ed5d5eae3802a3f8ce37d09233505ec?s=128&d=identicon&r=PG&f=1",
"display_name":"TaroYuki",
"link":"http://stackoverflow.com/users/5024892/taroyuki"
},
"is_answered":true,
"view_count":28,
"answer_count":1,
"score":-1,
"last_activity_date":1455221065,
"creation_date":1455220954,
"question_id":35348969,
"link":"http://stackoverflow.com/questions/35348969/convert-to-double-failed-if-the-value-is-null-or-empty",
"title":"Convert to double failed if the value is null or empty"
}
],
"has_more":true,
"quota_max":300,
"quota_remaining":296
}
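As a rough sketch of reading a response shaped like the one above using only Foundation: the helper name PTEQuestionTitlesFromResponseData is hypothetical and not part of the example app, which parses into Core Data objects instead. It only assumes the "items", "title" and "has_more" keys shown in the sample.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the example app): returns the titles found in
// the "items" array of a questions response, or nil if the data isn't a JSON
// object. If hasMore is non-NULL it receives the value of the "has_more" flag.
static NSArray<NSString *> *PTEQuestionTitlesFromResponseData(NSData *data, BOOL *hasMore)
{
    if (!data)
    {
        return nil;
    }

    id json = [NSJSONSerialization JSONObjectWithData:data options:0 error:NULL];
    if (![json isKindOfClass:[NSDictionary class]])
    {
        return nil;
    }

    NSDictionary *response = (NSDictionary *)json;
    if (hasMore)
    {
        id flag = response[@"has_more"];
        *hasMore = [flag isKindOfClass:[NSNumber class]] ? [flag boolValue] : NO;
    }

    // Guard against a missing or malformed "items" value.
    id items = response[@"items"];
    if (![items isKindOfClass:[NSArray class]])
    {
        return @[];
    }

    NSMutableArray<NSString *> *titles = [NSMutableArray array];
    for (id item in (NSArray *)items)
    {
        if (![item isKindOfClass:[NSDictionary class]])
        {
            continue;
        }
        id title = item[@"title"];
        if ([title isKindOfClass:[NSString class]])
        {
            [titles addObject:title];
        }
    }
    return [titles copy];
}
```

The has_more flag is what tells a client whether another page exists, which is why the sketch surfaces it alongside the items.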
Paginating on this endpoint is as simple as adding &page=n, where n is whichever page you want to retrieve.
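A minimal sketch of building such a paginated URL with NSURLComponents follows. The helper name PTEQuestionsURLForPage is an assumption for illustration; the example app itself builds the string with stringWithFormat: as shown later in the post.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the questions URL for a given page number by
// appending a "page" query item to the endpoint URL used in this post.
static NSURL *PTEQuestionsURLForPage(NSUInteger page)
{
    NSString *baseURLString = @"https://api.stackexchange.com/2.2/questions?order=desc&sort=creation&site=stackoverflow";
    NSURLComponents *components = [NSURLComponents componentsWithString:baseURLString];

    // Keep the existing query items and add the page number to the end.
    NSMutableArray<NSURLQueryItem *> *queryItems = [NSMutableArray arrayWithArray:components.queryItems ?: @[]];
    NSString *pageValue = [NSString stringWithFormat:@"%lu", (unsigned long)page];
    [queryItems addObject:[NSURLQueryItem queryItemWithName:@"page" value:pageValue]];
    components.queryItems = queryItems;

    return components.URL;
}
```

Using query items rather than string concatenation means the helper doesn't need to know whether the base URL already has a query string.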
Armed with this API, data model and caching priority we will build a system that builds up a paginated data set focused on ensuring that the user isn't presented with gaps in that data set. If a gap is spotted the app should delete every page of data other than the first/top page and retrieve fresh pages in sequence - the delete should happen at the earliest possible moment. Using this requirement we will parse the JSON response into a set of NSManagedObject subclasses and present these in a tableview using an NSFetchedResultsController. The app will request the first/top page of the feed when the user opens the app and the next page of data when the user reaches the end of the cached pages by scrolling the tableview to its end.
The example below works for a chronologically ordered set of data.
Modeling
The first thing to consider is that we need a way to determine if we have any gaps in our pages. Following the data model described above I propose that we build the following model:
- a Feed class that will store a set of pages
- a Page class that will hold a reference to the feed it's in and also a set of questions
- a Question class that will hold a reference to its page
Ok, so we have a basic data model but this doesn't allow us to identify if we have gaps in our cached pages. In order to support this we need to add more state to both the Feed and Page classes. Let's add an arePagesInSequence boolean property that will allow us to query if the feed contains pages that are in sequence or if it has gaps in its model. The arePagesInSequence property will be a calculated value that we determine by examining the response that we retrieve from API requests. When paginating we don't care if the data is in sequence or not: we are adding new pages of data to the end of our feed, driven by the user's downward scroll on the tableview, so we are building on what we already have and shouldn't introduce any gaps. We only need to concern ourselves with determining if the pages are in sequence when we request the first/topmost page.
In order to do this we need to examine the pages of Questions that we are returned; each Question has its own unique ID. Using this information we know that if we get back a page that contains a question that we already have then the pages are in sequence, and if we don't then we need to treat our feed as being out of sequence (it's possible to get back a page of unique IDs when the feed is actually still in sequence, however for simplicity in this example we will ignore this scenario). To detect this we can check whether a page has the same number of questions after the JSON has been parsed as were present in the JSON response. Let's store this value in a fullPage boolean property; we need this property because the last page can contain fewer questions than the maximum number of questions available for the page size. After parsing each page we can then query fullPage on the newly parsed page and set the arePagesInSequence property on the feed instance.
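The overlap check described above can be sketched without Core Data. This is only an illustration under stated assumptions: the helper name PTEPageJoinsCachedQuestions and the use of plain NSNumber IDs are hypothetical, whereas the example app derives the same answer from the fullPage property during parsing.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: given the question IDs already cached and the IDs in a
// freshly retrieved first/top page, returns YES when at least one retrieved ID
// is already cached, i.e. the new page overlaps with (and so joins onto) the
// cached pages. NO means every ID was new, so a gap may exist.
static BOOL PTEPageJoinsCachedQuestions(NSSet<NSNumber *> *cachedQuestionIDs,
                                        NSArray<NSNumber *> *retrievedQuestionIDs)
{
    for (NSNumber *questionID in retrievedQuestionIDs)
    {
        if ([cachedQuestionIDs containsObject:questionID])
        {
            return YES;
        }
    }
    return NO;
}
```

In the terms used by the post, a return value of YES corresponds to a page that is not "full" (some questions were already cached), which is why arePagesInSequence ends up being the inverse of fullPage.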
Next we need to consider how to store the URL for the next page in the pagination sequence. If we store it in the feed, we will need not only the next-in-sequence URL but also the end-of-feed URL. You may be thinking 'but don't we delete out-of-sequence pages?', and while this is true we only want to delete the out-of-sequence pages when the user is not interacting with any of those pages, i.e. when the user is on the first/top page of questions. So it's entirely possible (and very much probable) that the app will need to store the URL for both the out-of-sequence pages and the in-sequence pages. This has the potential to be a headache! But if we think about our pages as a linked list we can see that it shouldn't be the Feed that stores the next URL but rather the Page. This way, when we request the next page of data it's just a case of passing in the preceding page and asking it for the next URL - we don't care if that request is filling a gap or adding a page to the end/bottom of the feed. Let's add a nextHref property to our Page class.
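To make the linked-list idea concrete, here is a small sketch using a plain object in place of the Core Data entity. PTESketchPage and PTENextURLAfterPage are hypothetical names introduced only for this illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical plain-object stand-in for the Page entity: each page knows the
// URL of the page that follows it, like a node in a linked list.
@interface PTESketchPage : NSObject

@property (nonatomic, copy) NSString *nextHref;

@end

@implementation PTESketchPage

@end

// Returns the URL to request after `precedingPage`, falling back to the
// first-page URL when there is no preceding page (or it has no nextHref).
static NSURL *PTENextURLAfterPage(PTESketchPage *precedingPage, NSString *firstPageURLString)
{
    NSString *urlString = precedingPage.nextHref ?: firstPageURLString;
    return [NSURL URLWithString:urlString];
}
```

The caller never needs to know whether the request fills a gap or extends the end of the feed; it only hands over the page that comes before the one it wants.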
Enough chat, let's see some code
Feed:
NS_ASSUME_NONNULL_BEGIN
extern NSString * kPTEBaseURLString;
@interface PTEFeed : NSManagedObject
@property (nonatomic, strong, readonly) NSArray *orderedPages;
+ (PTEFeed *)questionFeed;
+ (PTEFeed *)questionFeedWithManagedObjectContext:(NSManagedObjectContext *)managedObjectContext;
@end
NS_ASSUME_NONNULL_END
#import "PTEFeed+CoreDataProperties.h"
PTEFeed+CoreDataProperties:
NS_ASSUME_NONNULL_BEGIN
@interface PTEFeed (CoreDataProperties)
@property (nullable, nonatomic, retain) NSNumber *arePagesInSequence;
@property (nullable, nonatomic, retain) NSSet *pages;
@end
@interface PTEFeed (CoreDataGeneratedAccessors)
- (void)addPagesObject:(PTEPage *)value;
- (void)removePagesObject:(PTEPage *)value;
- (void)addPages:(NSSet *)values;
- (void)removePages:(NSSet *)values;
@end
NS_ASSUME_NONNULL_END
Here we are using the newer approach of splitting NSManagedObject subclasses into one concrete class (PTEFeed) that the developer can add to and one category (PTEFeed+CoreDataProperties) that is generated for us. We have the arePagesInSequence property we spoke about above, and the class also contains a few convenience methods for retrieving one particular instance of a feed and an ordered array of pages.
Page:
NS_ASSUME_NONNULL_BEGIN
@interface PTEPage : NSManagedObject
// Insert code here to declare functionality of your managed object subclass
@end
NS_ASSUME_NONNULL_END
#import "PTEPage+CoreDataProperties.h"
PTEPage+CoreDataProperties:
NS_ASSUME_NONNULL_BEGIN
@interface PTEPage (CoreDataProperties)
@property (nullable, nonatomic, retain) NSDate *createdDate;
@property (nullable, nonatomic, retain) NSString *nextHref;
@property (nullable, nonatomic, retain) NSNumber *index;
@property (nullable, nonatomic, retain) NSNumber *fullPage;
@property (nullable, nonatomic, retain) NSSet *questions;
@property (nullable, nonatomic, retain) PTEFeed *feed;
@end
@interface PTEPage (CoreDataGeneratedAccessors)
- (void)addQuestionsObject:(PTEQuestion *)value;
- (void)removeQuestionsObject:(PTEQuestion *)value;
- (void)addQuestions:(NSSet *)values;
- (void)removeQuestions:(NSSet *)values;
@end
NS_ASSUME_NONNULL_END
Both nextHref and fullPage are present, as discussed above.
Question:
NS_ASSUME_NONNULL_BEGIN
@interface PTEQuestion : NSManagedObject
// Insert code here to declare functionality of your managed object subclass
@end
NS_ASSUME_NONNULL_END
#import "PTEQuestion+CoreDataProperties.h"
PTEQuestion+CoreDataProperties:
NS_ASSUME_NONNULL_BEGIN
@interface PTEQuestion (CoreDataProperties)
@property (nullable, nonatomic, retain) NSString *title;
@property (nullable, nonatomic, retain) NSString *author;
@property (nullable, nonatomic, retain) NSDate *createdDate;
@property (nullable, nonatomic, retain) NSNumber *index;
@property (nullable, nonatomic, retain) NSNumber *questionID;
@property (nullable, nonatomic, retain) PTEPage *page;
@end
NS_ASSUME_NONNULL_END
Nothing much here related to our pagination approach other than that questions are associated with a page.
Ok, so that's our data model complete - let's look at how we retrieve the JSON that we cache and how we populate the properties declared in our model classes.
Retrieving
To retrieve our JSON response we will work with three groups of classes:
- a QuestionsAPIManager class that will hide the details of the API call being made and of how that call is made
- a PTEQuestionsRetrievalOperation class that will handle processing the JSON response on a background thread
- a PTEQuestionParser class that will actually parse the JSON response
By abstracting the actual API call behind QuestionsAPIManager's interface we can handle configuring the URL that will be used without the ViewController needing to care about the details.
+ (void)retrievalQuestionsForFeed:(PTEFeed *)feed
refresh:(BOOL)refresh
completion:(void(^)(BOOL successful))completion
{
NSURLSession *session = [NSURLSession sharedSession];
NSURL *url = nil;
// A refresh always requests the first/top page; only pagination follows nextHref.
if (!refresh && feed.pages.count > 0)
{
PTEPage *page = [feed.orderedPages lastObject];
url = [NSURL URLWithString:page.nextHref];
}
else
{
NSString *urlString = [[NSMutableString alloc] initWithString:kPTEBaseURLString];
url = [NSURL URLWithString:urlString];
}
NSManagedObjectID *feedObjectID = feed.objectID;
NSURLSessionDataTask *task = [session dataTaskWithURL:url
completionHandler:^(NSData * _Nullable data, NSURLResponse * _Nullable response, NSError * _Nullable error)
{
dispatch_async(dispatch_get_main_queue(), ^
{
PTEQuestionsRetrievalOperation *operation = [[PTEQuestionsRetrievalOperation alloc] initWithFeedID:feedObjectID
data:data
refresh:refresh
completion:completion];
[[PTEQueueManager sharedInstance].queue addOperation:operation];
});
}];
[task resume];
}
In the above method we pass in a refresh parameter to trigger either a refresh (first/top page) or a pagination request. We then make the actual API call and pass its response on to the operation to be processed. An interesting aside is that when scheduling the operation we switch onto the main queue/thread; this is because the completion block that we pass through to the operation will be executed on the queue that the operation was created on, as we will see in the operation itself.
The operation handles serializing the NSData returned into an NSDictionary, triggering the parsing of that serialized NSDictionary and updating the parsed Page's properties within the context of the Feed.
@interface PTEQuestionsRetrievalOperation ()
@property (nonatomic, strong) NSManagedObjectID *feedID;
@property (nonatomic, strong) NSData *data;
@property (nonatomic, copy) void (^completion)(BOOL successful);
@property (nonatomic, assign) BOOL refresh;
@property (nonatomic, strong) NSOperationQueue *callBackQueue;
- (NSNumber *)indexOfNewPageInFeed:(PTEFeed *)feed;
- (void)reorderIndexInFeed:(PTEFeed *)feed;
@end
@implementation PTEQuestionsRetrievalOperation
#pragma mark - Init
- (instancetype)initWithFeedID:(NSManagedObjectID *)feedID
data:(NSData *)data
refresh:(BOOL)refresh
completion:(void(^)(BOOL successful))completion
{
self = [super init];
if (self)
{
self.feedID = feedID;
self.data = data;
self.completion = completion;
self.callBackQueue = [NSOperationQueue currentQueue];
self.refresh = refresh;
}
return self;
}
#pragma mark - Main
- (void)main
{
[super main];
NSError *serializationError = nil;
NSDictionary *jsonResponse = [NSJSONSerialization JSONObjectWithData:self.data
options:NSJSONReadingMutableContainers
error:&serializationError];
if (serializationError)
{
[self.callBackQueue addOperationWithBlock:^
{
if (self.completion)
{
self.completion(NO);
}
}];
}
else
{
[[CDSServiceManager sharedInstance].backgroundManagedObjectContext performBlockAndWait:^
{
PTEQuestionParser *parser = [[PTEQuestionParser alloc] init];
PTEPage *page = [parser parseQuestions:jsonResponse];
PTEFeed *feed = [[CDSServiceManager sharedInstance].backgroundManagedObjectContext existingObjectWithID:self.feedID
error:nil];
page.nextHref = [NSString stringWithFormat:@"%@&page=%@", kPTEBaseURLString, @(feed.pages.count + 1)];
page.index = [self indexOfNewPageInFeed:feed];
[self reorderIndexInFeed:feed];
if (self.refresh)
{
feed.arePagesInSequence = @(!page.fullPage.boolValue);
}
[feed addPagesObject:page];
/*----------------*/
[[CDSServiceManager sharedInstance] saveBackgroundManagedObjectContext];
}];
/*----------------*/
[self.callBackQueue addOperationWithBlock:^
{
if (self.completion)
{
self.completion(YES);
}
}];
}
}
#pragma mark - PageIndex
- (void)reorderIndexInFeed:(PTEFeed *)feed
{
NSArray *pages = feed.orderedPages;
for (NSUInteger index = 0; index < pages.count; index++)
{
PTEPage *page = pages[index];
page.index = @(index);
}
}
- (NSNumber *)indexOfNewPageInFeed:(PTEFeed *)feed
{
NSNumber *indexOfNewPage = nil;
if (self.refresh)
{
indexOfNewPage = @(-1);
}
else
{
indexOfNewPage = @(feed.pages.count);
}
return indexOfNewPage;
}
@end
In the above class we determine the next page URL by using the count of the pages that we have cached, and we determine if the feed's pages are in sequence.
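The indexing trick used by the operation (give a refreshed page an index of -1 so it sorts ahead of index 0, then renumber) can be sketched in isolation. PTEIndexedPage and PTEReindexPages are hypothetical stand-ins, not the app's Core Data classes, and the sketch assumes ordering is driven purely by the index property.

```objc
#import <Foundation/Foundation.h>

// Hypothetical plain-object stand-in for a page with a sortable index.
@interface PTEIndexedPage : NSObject

@property (nonatomic, copy) NSString *name;
@property (nonatomic, strong) NSNumber *index;

@end

@implementation PTEIndexedPage

@end

// Sorts pages by their current index (a refreshed page carries -1 so it lands
// first) and then renumbers them 0..n-1. Returns the pages in their new order.
static NSArray<PTEIndexedPage *> *PTEReindexPages(NSArray<PTEIndexedPage *> *pages)
{
    NSSortDescriptor *byIndex = [NSSortDescriptor sortDescriptorWithKey:@"index" ascending:YES];
    NSArray<PTEIndexedPage *> *orderedPages = [pages sortedArrayUsingDescriptors:@[byIndex]];

    for (NSUInteger position = 0; position < orderedPages.count; position++)
    {
        orderedPages[position].index = @(position);
    }

    return orderedPages;
}
```

Renumbering after every insert keeps the indexes contiguous, so the next refreshed page can safely reuse -1.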
The parser itself is pretty standard.
@implementation PTEQuestionParser
#pragma mark - Parse
- (PTEPage *)parseQuestions:(NSDictionary *)questionsRetrievalReponse
{
PTEPage *page = [NSEntityDescription cds_insertNewObjectForEntityForClass:[PTEPage class]
inManagedObjectContext:[CDSServiceManager sharedInstance].backgroundManagedObjectContext];
NSArray *questionsReponse = questionsRetrievalReponse[@"items"];
for (NSUInteger index = 0; index < questionsReponse.count; index++)
{
NSDictionary *questionResponse = questionsReponse[index];
PTEQuestion *question = [self parseQuestion:questionResponse];
question.index = @(index);
if (!question.page)
{
[page addQuestionsObject:question];
}
else
{
page.fullPage = @(NO);
}
}
return page;
}
- (PTEQuestion *)parseQuestion:(NSDictionary *)questionReponse
{
NSUInteger questionID = [questionReponse[@"question_id"] unsignedIntegerValue];
NSPredicate *predicate = [NSPredicate predicateWithFormat:@"questionID == %@", @(questionID)];
PTEQuestion *question = (PTEQuestion *)[[CDSServiceManager sharedInstance].backgroundManagedObjectContext cds_retrieveFirstEntryForEntityClass:[PTEQuestion class]
predicate:predicate];
if (!question)
{
question = [NSEntityDescription cds_insertNewObjectForEntityForClass:[PTEQuestion class]
inManagedObjectContext:[CDSServiceManager sharedInstance].backgroundManagedObjectContext];
question.questionID = @(questionID);
}
question.title = questionReponse[@"title"];
NSDictionary *ownerResponse = questionReponse[@"owner"];
question.author = ownerResponse[@"display_name"];
return question;
}
What's important to note for our pagination approach is:
if (!question.page)
{
[page addQuestionsObject:question];
}
else
{
page.fullPage = @(NO);
}
Here we work out if the page is a full page or not by checking if the question already exists in our cached data - fullPage defaults to YES.
Tidying up after ourselves
So we have now seen how we retrieve, parse and store data but we still need to look at how we delete pages when the feed goes out of sequence. For this we want to trigger the delete action when the user has scrolled onto the first page of questions and can no longer see the out-of-sequence pages - thankfully, as we are using a tableview to present our questions, we can implement some optional UITableViewDelegate methods to control this.
- (void)tableView:(UITableView *)tableView didEndDisplayingCell:(UITableViewCell *)cell forRowAtIndexPath:(NSIndexPath *)indexPath
{
if (self.fetchedResultsController.fetchedObjects.count > indexPath.row)
{
if(!self.feed.arePagesInSequence.boolValue)
{
PTEDeleteOutOfSyncQuestionPagesOperation *operation = [[PTEDeleteOutOfSyncQuestionPagesOperation alloc] init];
[[PTEQueueManager sharedInstance].queue addOperation:operation];
}
}
}
In the above method we check if the feed's pages are in sequence and trigger the deletion of older pages.
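The body of PTEDeleteOutOfSyncQuestionPagesOperation isn't shown in this post, so the following is only a guess at the selection step, based on the earlier requirement that everything except the first/top page is deleted. The helper name PTEPagesToDelete is hypothetical and it works on any ordered array rather than on managed objects.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: given the feed's pages in display order, returns the
// pages that would be removed when the feed is out of sequence, i.e. every
// page except the first/top one. Returns an empty array when there is nothing
// to delete.
static NSArray *PTEPagesToDelete(NSArray *orderedPages)
{
    if (orderedPages.count < 2)
    {
        return @[];
    }

    NSRange staleRange = NSMakeRange(1, orderedPages.count - 1);
    return [orderedPages subarrayWithRange:staleRange];
}
```

In a Core Data setting, each returned page would then be passed to the managed object context for deletion and the feed marked as in sequence again, though that part is an assumption about the operation rather than something the post shows.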
And that is our pagination approach completed, phew! I know I included a lot of code in this post but there is actually a lot more in the example repo here - if you fancy looking at a few more examples.
What do you think? Let me know by getting in touch on Twitter - @wibosco